The current implementation of the direct computation, which uses scipy's cKDTree, is fast enough relative to the FFT part for PIP weights (as only DD counts are required), but not for angular upweights (as DR counts are required).
Different options:
extend cKDTree.query_pairs to cross-correlations (it currently only returns pairs within a single tree)
use numexpr to speed up the computation of spherical Bessel functions and Legendre multipoles
write the loop over pairs in C, or the whole tree in plain C (starting from the cKDTree implementation?)
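
As a baseline for the first option, scipy already provides cKDTree.query_ball_tree, which finds pairs between two different trees. The sketch below uses it to enumerate D-R pairs within a maximum separation. The function name cross_pairs, the random test catalogs and the value of rmax are illustrative assumptions, not part of the current implementation. For angular separations one would presumably pass unit vectors and convert the maximum angle theta to a chord length 2 sin(theta / 2).

```python
import numpy as np
from scipy.spatial import cKDTree


def cross_pairs(positions1, positions2, rmax):
    """Return indices (i, j) and separations of all cross pairs within rmax.

    Hypothetical helper: positions1 and positions2 are (N, 3) arrays
    (e.g. data and randoms), rmax is the maximum Cartesian separation.
    """
    tree1 = cKDTree(positions1)
    tree2 = cKDTree(positions2)
    # For each point of tree1, list of indices of tree2 points within rmax
    neighbors = tree1.query_ball_tree(tree2, r=rmax)
    counts = [len(n) for n in neighbors]
    i = np.repeat(np.arange(len(neighbors)), counts)
    j = np.concatenate([np.asarray(n, dtype=int) for n in neighbors])
    dist = np.linalg.norm(positions1[i] - positions2[j], axis=1)
    return i, j, dist


# Illustrative catalogs (assumed sizes and box)
rng = np.random.default_rng(42)
data = rng.uniform(0., 1., size=(200, 3))
randoms = rng.uniform(0., 1., size=(500, 3))
rmax = 0.1

i, j, dist = cross_pairs(data, randoms, rmax)
# Cross-check against cKDTree.count_neighbors, which only returns the count
expected = cKDTree(data).count_neighbors(cKDTree(randoms), rmax)
```

Since query_ball_tree returns Python lists, the conversion to arrays adds its own overhead, so timing this against the current DD code would indicate whether extending query_pairs or moving the pair loop to C is the better route.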