Open smcguire-cmu opened 6 months ago
@smcguire-cmu Can you remind me, is this still active?
So I looked into this a while back. I switched our join and crossmatch implementations to use map_partitions instead of delayed and saw some improvement in lazy operation time, but not an order-of-magnitude improvement.
Advantages:
Disadvantages:
Instead of using dask delayed to align and map over the partitions of our catalogs, we could try using the `ddf.partitions` accessor to align the partitions as necessary and then map_partitions over them. There are still open questions about how we deal with empty partitions and divisions, but this may be an approach worth looking into.