astronomy-commons / hipscat-import

HiPSCat import - generate HiPSCat-partitioned catalogs
https://hipscat-import.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Reimport ZTF with larger ~100 MB parquet file sizes. #237

Closed. nevencaplar closed this issue 6 months ago

nevencaplar commented 6 months ago

Reimport ZTF with larger ~100 MB parquet file sizes (on epyc, PSC). This is to reduce the number of partitions.

delucchi-cmu commented 6 months ago

Note that the directory is named ztf_zource, not ztf_source.

catalog_dir = "/data3/epyc/data3/hipscat/catalogs/ztf_axs/ztf_zource/"

healpix orders: [2 3 4 5 6 7 8 9]
num partitions: 41679
------
min size_on_disk: 0.00
max size_on_disk: 0.46
total size_on_disk: 8343.36

small-ish   : 2779  (6.7 %)
sweet-spot  : 38900     (93.3 %)

and a histogram of leaf parquet file sizes (units are GB): [image]
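For anyone wanting to produce a similar size summary on another catalog, here is a minimal sketch, not the script used for the numbers above. It assumes leaf files end in `.parquet` and that sizes are reported in GB using 1024^3 bytes. The bin edges in `BINS` are hypothetical, since the thread does not state the thresholds behind "small-ish" and "sweet-spot".

```python
from pathlib import Path

GB = 1024 ** 3  # bytes per GB (binary convention; an assumption)

# Hypothetical bin edges in GB; the real thresholds are not given in the thread.
BINS = [
    ("small-ish", 0.0, 0.1),
    ("sweet-spot", 0.1, 1.0),
    ("big-ish", 1.0, float("inf")),
]


def leaf_sizes_gb(catalog_dir):
    """Return the on-disk size in GB of every *.parquet file under catalog_dir."""
    return [p.stat().st_size / GB for p in Path(catalog_dir).rglob("*.parquet")]


def summarize(sizes_gb):
    """Compute min/max/total size and per-bin counts for a list of sizes in GB."""
    counts = {label: 0 for label, _, _ in BINS}
    for size in sizes_gb:
        for label, low, high in BINS:
            if low <= size < high:
                counts[label] += 1
                break
    return {
        "num_partitions": len(sizes_gb),
        "min": min(sizes_gb, default=0.0),
        "max": max(sizes_gb, default=0.0),
        "total": sum(sizes_gb),
        "counts": counts,
    }


def print_summary(summary):
    """Print the summary in a layout similar to the output pasted above."""
    n = summary["num_partitions"]
    print(f"num partitions: {n}")
    print("------")
    print(f"min size_on_disk: {summary['min']:.2f}")
    print(f"max size_on_disk: {summary['max']:.2f}")
    print(f"total size_on_disk: {summary['total']:.2f}")
    print()
    for label, count in summary["counts"].items():
        if count:
            print(f"{label:<12}: {count}  ({100 * count / n:.1f} %)")
```

Usage would be along the lines of `print_summary(summarize(leaf_sizes_gb(catalog_dir)))`, with `catalog_dir` pointing at the catalog root.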