astronomy-commons / hipscat-import

HiPSCat import - generate HiPSCat-partitioned catalogs
https://hipscat-import.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Row count checks on final parquet, _metadata and ancillary files #373

Open nevencaplar opened 1 month ago

nevencaplar commented 1 month ago

This is one of the verification pipeline tickets, related to https://github.com/astronomy-commons/hipscat-import/issues/344

Implement row count checks on:

  1. Final Parquet Files
     ● Get row counts from file footers.
     ● Compare total with truth.
     ● Compare per partition with intermediate files.
  2. _metadata File
     ● Get row counts from _metadata file.
     ● Compare total with truth.
     ● Compare per partition with intermediate files.
  3. Ancillary Files
     ● Check numbers in all ancillary files.
     ● Total in README.
     ● Counts per file/partition in csv files.
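
A minimal sketch of how these checks could fit together, not the hipscat-import implementation. It assumes pyarrow is available for reading Parquet footers and the _metadata file. All function names, and the `key_column` / `count_column` parameters for the CSV files, are hypothetical; the actual ancillary file layout should be taken from the catalog itself.

```python
import csv
from collections import defaultdict


def counts_from_footers(parquet_paths):
    """Map each file path to the row count stored in its Parquet footer."""
    import pyarrow.parquet as pq  # lazy import: only needed for real files

    return {str(path): pq.read_metadata(path).num_rows for path in parquet_paths}


def counts_from_metadata(metadata_path):
    """Sum row-group counts per data file as recorded in a _metadata file."""
    import pyarrow.parquet as pq

    meta = pq.read_metadata(metadata_path)
    counts = defaultdict(int)
    for i in range(meta.num_row_groups):
        row_group = meta.row_group(i)
        # file_path is stored relative to the directory holding _metadata
        counts[row_group.column(0).file_path] += row_group.num_rows
    return dict(counts)


def counts_from_csv(csv_file, key_column, count_column):
    """Read per-partition counts from an open CSV file (column names are assumptions)."""
    reader = csv.DictReader(csv_file)
    return {row[key_column]: int(row[count_column]) for row in reader}


def compare_counts(actual, expected, expected_total=None):
    """Return a list of mismatch messages; an empty list means every check passed."""
    problems = []
    for key in sorted(set(actual) | set(expected)):
        if key not in actual:
            problems.append(f"{key}: missing from actual counts")
        elif key not in expected:
            problems.append(f"{key}: unexpected partition")
        elif actual[key] != expected[key]:
            problems.append(f"{key}: {actual[key]} rows, expected {expected[key]}")
    total = sum(actual.values())
    if expected_total is not None and total != expected_total:
        problems.append(f"total: {total} rows, expected {expected_total}")
    return problems
```

The same `compare_counts` call would serve all three checks: pass the counts from footers, from _metadata, or from a CSV file as `actual`, the intermediate-file counts as `expected`, and the known truth as `expected_total`. Note that the keys must be normalized first, since footers are keyed here by the path given, while _metadata records paths relative to its own directory.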