Closed gspowley closed 2 months ago
A utility script to provide the size of an array broken down per column, optionally displaying the filters.
    $ array-size.py array_uri
        name            type        data  offsets  validity    total  percent
    0   fmt             uint8    8483150    66593         0  8549743   50.064
    1   info            uint8    5989054    54952         0  6044006   35.391
    2   qual            float32   529503        0         0   529503    3.101
    3   end_pos         uint32    455988        0         0   455988    2.670
    4   real_start_pos  uint32    455918        0         0   455918    2.670
    5   start_pos*      uint32    441214        0         0   441214    2.584
    6   alleles         bytes     329757   105369         0   435126    2.548
    7   fmt_GT          uint8     119494     9184         0   128678    0.753
    8   filter_ids      int32      10569     9184         0    19753    0.116
    9   id              bytes       3887     9208         0    13095    0.077
    10  sample*         bytes       2862      216         0     3078    0.018
    11  contig*         bytes       1350      216         0     1566    0.009
    Total size: 16.29 MiB
With the `--filter` option, the output also shows the filters applied to each column:
    $ array-size.py --filter array_uri
        name            type     filter                                                                                    data  offsets  validity    total  percent
    0   fmt             uint8    (ZstdFilter(level=4), ChecksumSHA256Filter())                                          8483150    66593         0  8549743   50.064
    1   info            uint8    (ZstdFilter(level=4), ChecksumSHA256Filter())                                          5989054    54952         0  6044006   35.391
    2   qual            float32  (ZstdFilter(level=4), ChecksumSHA256Filter())                                           529503        0         0   529503    3.101
    3   end_pos         uint32   (ByteShuffleFilter(), ZstdFilter(level=4), ChecksumSHA256Filter())                      455988        0         0   455988    2.670
    4   real_start_pos  uint32   (ByteShuffleFilter(), ZstdFilter(level=4), ChecksumSHA256Filter())                      455918        0         0   455918    2.670
    5   start_pos*      uint32   (DoubleDeltaFilter(reinterp_dtype=None), ZstdFilter(level=4), ChecksumSHA256Filter())   441214        0         0   441214    2.584
    6   alleles         bytes    (ZstdFilter(level=4), ChecksumSHA256Filter())                                           329757   105369         0   435126    2.548
    7   fmt_GT          uint8    (ZstdFilter(level=4), ChecksumSHA256Filter())                                           119494     9184         0   128678    0.753
    8   filter_ids      int32    (ByteShuffleFilter(), ZstdFilter(level=4), ChecksumSHA256Filter())                       10569     9184         0    19753    0.116
    9   id              bytes    (ZstdFilter(level=4), ChecksumSHA256Filter())                                             3887     9208         0    13095    0.077
    10  sample*         bytes    (DictionaryFilter(), ZstdFilter(level=4))                                                 2862      216         0     3078    0.018
    11  contig*         bytes    (RleFilter())                                                                             1350      216         0     1566    0.009
    Total size: 16.29 MiB
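For readers checking the numbers, the `total` and `percent` columns and the final `Total size` line can be reproduced from the `data`, `offsets`, and `validity` byte counts in the listings above. The following is only a sketch of that arithmetic, not the script's actual implementation: the names `ColumnSize`, `summarize`, and `format_mib` are hypothetical, and it assumes that rows are sorted by total size in descending order and that MiB means 2**20 bytes, which is what the sample output appears to show.

```python
from typing import NamedTuple


class ColumnSize(NamedTuple):
    """Byte counts for one column (hypothetical helper, not from the script)."""
    name: str
    data: int
    offsets: int
    validity: int

    @property
    def total(self) -> int:
        # A column's total is the sum of its three components.
        return self.data + self.offsets + self.validity


def summarize(columns):
    """Return (rows, grand_total); rows are (name, total, percent), largest first."""
    grand_total = sum(c.total for c in columns)
    ordered = sorted(columns, key=lambda c: c.total, reverse=True)
    rows = [(c.name, c.total, round(100 * c.total / grand_total, 3)) for c in ordered]
    return rows, grand_total


def format_mib(num_bytes: int) -> str:
    """Format a byte count in MiB (assumed here to be 2**20 bytes)."""
    return f"{num_bytes / 2**20:.2f} MiB"


# Byte counts copied from the sample output above.
COLUMNS = [
    ColumnSize("fmt", 8483150, 66593, 0),
    ColumnSize("info", 5989054, 54952, 0),
    ColumnSize("qual", 529503, 0, 0),
    ColumnSize("end_pos", 455988, 0, 0),
    ColumnSize("real_start_pos", 455918, 0, 0),
    ColumnSize("start_pos*", 441214, 0, 0),
    ColumnSize("alleles", 329757, 105369, 0),
    ColumnSize("fmt_GT", 119494, 9184, 0),
    ColumnSize("filter_ids", 10569, 9184, 0),
    ColumnSize("id", 3887, 9208, 0),
    ColumnSize("sample*", 2862, 216, 0),
    ColumnSize("contig*", 1350, 216, 0),
]

rows, grand_total = summarize(COLUMNS)
for name, total, percent in rows:
    print(f"{name:<16}{total:>9}{percent:>9.3f}")
print(f"Total size: {format_mib(grand_total)}")
```

Run against the sample byte counts, this reproduces the `total` column and the `Total size: 16.29 MiB` line.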