astronomy-commons / hipscat-import

HiPSCat import - generate HiPSCat-partitioned catalogs
https://hipscat-import.readthedocs.io
BSD 3-Clause "New" or "Revised" License

High memory usage for Finishing stage #364

Open delucchi-cmu opened 1 month ago

delucchi-cmu commented 1 month ago

Bug report

When importing the large, wide Pan-STARRS detections table, STScI reports needing 41 GiB of memory to complete the Finishing stage.

The Planning, Binning, and Finishing stages currently run inside the controller job, instead of inside dask workers. It can be trickier to add more memory to the controller (for various reasons).

Profile the Finishing stage to look for possible memory hogs and fix them.
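
One possible starting point for the profiling is Python's standard-library `tracemalloc`. This is a minimal sketch, not hipscat-import code: `finishing_stage_stand_in` and `profile_peak_memory` are hypothetical names, and the stand-in only imitates a stage that builds a large in-memory structure. The real Finishing entry point would be passed in its place.

```python
import tracemalloc


def finishing_stage_stand_in(num_rows=50_000):
    """Hypothetical placeholder for the real Finishing stage work."""
    # Build a large list of dicts to imitate metadata held in memory.
    return [{"pixel": i, "count": i * 2} for i in range(num_rows)]


def profile_peak_memory(func, *args, top_n=5, **kwargs):
    """Run func under tracemalloc.

    Returns (result, peak_bytes, top_stats). peak_bytes covers transient
    allocations too; top_stats only lists allocation sites whose memory
    is still alive once func has returned.
    """
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        snapshot = tracemalloc.take_snapshot()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    # Group live allocations by source line, largest first.
    top_stats = snapshot.statistics("lineno")[:top_n]
    return result, peak, top_stats


if __name__ == "__main__":
    _, peak_bytes, stats = profile_peak_memory(finishing_stage_stand_in)
    print(f"peak traced memory: {peak_bytes / 2**20:.1f} MiB")
    for stat in stats:
        print(stat)
```

`tracemalloc` only sees allocations made through Python's memory allocators, so buffers allocated directly by native libraries may not show up. Comparing the traced peak against the process's resident memory would show how much is unaccounted for.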

Before submitting

Please check the following: