steo85it / pyxover

Altimetry Analysis Tools For Planetary Geodesy
GNU General Public License v3.0

function call to compress/decompress input "range dataset" #17

Open steo85it opened 4 months ago

steo85it commented 4 months ago

Prepare two steps to compress and decompress the laser altimetry range .TAB files, and include them in the processing pipeline. First, check how much can be gained.

wdesprats commented 4 months ago

Before looking at compression, I quickly checked how much we would gain by storing the .TAB files as Parquet files. For BELA, one file contains 35.4K entries, which amounts to 4.6 MB even after removing most of the unused columns. Converting it to Parquet reduced the file size to 2.2 MB.

Alternatively, if we simply compress the TAB files and decompress them on the fly, 1 month of BELA data compresses from 1.7 GB to 568 MB.
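A minimal sketch of the compress / decompress-on-the-fly idea, using only the standard library. gzip is an assumption here (the numbers above do not say which compressor was used), and `compress_tab` and `open_tab` are hypothetical helper names, not existing pyxover functions.

```python
import gzip
import shutil


def compress_tab(tab_path, gz_path=None):
    """Write a gzip-compressed copy of a .TAB file and return its path."""
    gz_path = gz_path or f"{tab_path}.gz"
    with open(tab_path, "rb") as src, gzip.open(gz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)  # stream in chunks, no full load in memory
    return gz_path


def open_tab(path):
    """Open a .TAB or .TAB.gz file in text mode, decompressing on the fly."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")
```

With something like `open_tab`, the reading code in the pipeline would not need to know whether a given file is compressed. pandas can also read gzip-compressed text directly via the `compression` argument of `read_csv`, which may make a separate decompression step unnecessary.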

For one year of data, reducing the 20 GB of raw altimetry ranges to 10 GB of Parquet, or to 6.7 GB by compressing the data, would definitely be a plus.
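To put the figures quoted in this thread side by side, here is a small sketch that turns them into relative size reductions. The helper `reduction` is hypothetical, and 1.7 GB is taken as 1700 MB (decimal units), which is an assumption.

```python
def reduction(before, after):
    """Fraction of the original size that is saved (0.5 means half the size)."""
    return 1.0 - after / before


# (label, size before, size after), each pair in the same unit
cases = [
    ("one BELA file, TAB -> Parquet (MB)", 4.6, 2.2),
    ("one month, TAB -> compressed TAB (MB)", 1700.0, 568.0),
    ("one year, TAB -> Parquet (GB)", 20.0, 10.0),
    ("one year, TAB -> compressed TAB (GB)", 20.0, 6.7),
]

for label, before, after in cases:
    print(f"{label}: {reduction(before, after):.1%} smaller")
```

In short, the Parquet conversion saves about half of the volume, while compressing the TAB files saves about two thirds.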