ACDguide / BigData

Working with big/challenging data collections
https://ACDguide.github.io/BigData

to add to chunking #63

Open paolap opened 2 years ago

paolap commented 2 years ago

An example using NCO to fix a file's chunking that might be useful:

ncks --fix_rec_dmn time --cnk_plc nco infile outfile

Sometimes you have files that are the product of concatenation along the time axis, but the record dimension is still "unlimited". This can make the file artificially bigger, as time will have chunks of size 1. Just fixing the record dimension is not sufficient; you also have to make sure the file is rechunked. If you don't have a clear preference on how to rechunk the file, you can select the NCO default by passing --cnk_plc nco. This rechunks the file using the standard NCO approach.
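
To confirm the rechunking did what you expect, it may help to compare the header before and after. The following is a minimal sketch, not a tested recipe: the file names `concatenated.nc` and `rechunked.nc` and the helper `show_chunking` are hypothetical, and it assumes a netCDF-4 file with `ncks` and `ncdump` on the path. `ncdump -hs` prints the header plus the special attributes, including `_ChunkSizes`.

```shell
#!/bin/sh
# Sketch only: both file names are hypothetical placeholders.
infile="concatenated.nc"
outfile="rechunked.nc"

# Hypothetical helper: print the header lines that describe the record
# dimension (UNLIMITED) and the per-variable chunk sizes (_ChunkSizes).
show_chunking() {
    ncdump -hs "$1" | grep -E 'UNLIMITED|_ChunkSizes'
}

if command -v ncks >/dev/null 2>&1 && command -v ncdump >/dev/null 2>&1 && [ -f "$infile" ]; then
    echo "Before:"
    show_chunking "$infile"
    # Make time a fixed dimension and rechunk with the NCO default policy.
    ncks --fix_rec_dmn time --cnk_plc nco "$infile" "$outfile"
    echo "After:"
    show_chunking "$outfile"
else
    echo "Skipping: needs ncks, ncdump and an existing input file ($infile)"
fi
```

In the "After" output, the time dimension should no longer be listed as UNLIMITED, and the time entry in each `_ChunkSizes` should no longer be 1.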