astronomy-commons / hipscat

Hierarchical Progressive Survey Catalog
https://hipscat.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Optimize histogram binning to not produce large empty tiles #302

Closed hombit closed 2 months ago

hombit commented 2 months ago

Feature request

Currently, count histogram binning works greedily: it keeps merging tiles as long as the merged count stays below a threshold, and stops once the minimum order (default is 0) is reached. I believe we can be smarter and not merge any further when all tiles but one have zero counts.

This may produce too many partitions in some edge cases, e.g. very clumpy data in a catalog composed of many individual pointings. However, these edge cases can be handled by using a smaller maximum order.
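A rough sketch of the idea, to make the request concrete. This is not hipscat's actual implementation: the function name `align_tiles` and the flag `skip_single_child_merges` are hypothetical, and the only assumption about the data is HEALPix nested ordering, where the children of pixel `p` at order `k` are pixels `4p..4p+3` at order `k + 1`. With the flag off it reproduces the greedy behaviour described above; with the flag on, a merge is refused when only one of the four child tiles has any counts.

```python
import numpy as np


def align_tiles(histogram, highest_order, lowest_order=0, threshold=1_000,
                skip_single_child_merges=False):
    """Return (order, pixel, count) for every non-empty output tile.

    Hypothetical helper, for illustration only. `histogram` holds per-pixel
    counts at `highest_order` in HEALPix nested ordering.
    """
    histogram = np.asarray(histogram, dtype=np.int64)
    if histogram.size != 12 * 4**highest_order:
        raise ValueError("histogram length must be 12 * 4**highest_order")

    # counts[k][p] is the total count inside pixel p at order k.
    counts = {highest_order: histogram}
    for order in range(highest_order - 1, lowest_order - 1, -1):
        counts[order] = counts[order + 1].reshape(-1, 4).sum(axis=1)

    tiles = []

    def visit(order, pixel):
        total = int(counts[order][pixel])
        if total == 0:
            return  # empty tiles are never emitted
        if order == highest_order:
            tiles.append((order, pixel, total))
            return
        children = counts[order + 1][4 * pixel:4 * pixel + 4]
        fits = total <= threshold
        single_child = np.count_nonzero(children) == 1
        # Greedy rule: keep the coarse tile whenever it fits the threshold.
        # Proposed rule: refuse if all the counts sit in a single child.
        if fits and not (skip_single_child_merges and single_child):
            tiles.append((order, pixel, total))
            return
        for child in range(4 * pixel, 4 * pixel + 4):
            visit(order + 1, child)

    for pixel in range(12 * 4**lowest_order):
        visit(lowest_order, pixel)
    return tiles


# Five objects, all inside pixel 0 at order 2.
demo = np.zeros(12 * 4**2, dtype=np.int64)
demo[0] = 5
greedy = align_tiles(demo, highest_order=2, threshold=10)
proposed = align_tiles(demo, highest_order=2, threshold=10,
                       skip_single_child_merges=True)
# greedy   -> [(0, 0, 5)]  one large, mostly empty order-0 tile
# proposed -> [(2, 0, 5)]  the small order-2 tile that holds the data
```

The sketch walks top-down instead of merging bottom-up, which gives the same tiles because counts can only grow towards lower orders. In the clumpy case every isolated clump ends up in its own `highest_order` tile, which is why lowering the maximum order bounds the number of partitions.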

Before submitting Please check the following: