DFO-Ocean-Navigator / netcdf-timestamp-mapper

Maps timestamps (and variables) to a corresponding nc file using sqlite3.
https://dfo-ocean-navigator.github.io/netcdf-timestamp-mapper/
GNU General Public License v3.0

Increase file IO speed #7

Closed (htmlboss closed this issue 4 years ago)

htmlboss commented 4 years ago

There may be an opportunity to speed up nc file reading by playing with some of the block size parameters that the netcdf-c dataset functions accept: https://www.unidata.ucar.edu/software/netcdf/docs/group__datasets.html#gaccbdb128a1640e204831256dbbc24d3e

or

https://www.unidata.ucar.edu/software/netcdf/docs/group__datasets.html#ga88d20343e6357a33e973ba735052da74
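Before touching the netCDF parameters, it may help to check whether read block size matters at all on the drive in question. The following is a minimal sketch using only the C++ standard library; it does not call netCDF, and `read_in_blocks` and `time_read` are hypothetical helper names, not part of this tool or of any netCDF API.

```cpp
#include <chrono>
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not part of netcdf-timestamp-mapper or netCDF):
// reads `path` sequentially in blocks of `block_size` bytes and
// returns the total number of bytes read.
std::size_t read_in_blocks(const std::string& path, std::size_t block_size) {
    if (block_size == 0) {
        throw std::invalid_argument("block_size must be > 0");
    }
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        throw std::runtime_error("cannot open " + path);
    }

    std::vector<char> buffer(block_size);
    std::size_t total = 0;
    while (in) {
        // read() sets failbit on a short read; gcount() still reports
        // how many bytes were actually delivered by this call.
        in.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
        total += static_cast<std::size_t>(in.gcount());
    }
    return total;
}

// Times one full sequential pass over the file at the given block size.
// Returns the elapsed wall-clock time in seconds.
double time_read(const std::string& path, std::size_t block_size) {
    const auto start = std::chrono::steady_clock::now();
    read_in_blocks(path, block_size);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Calling `time_read` on the same nc file with a few block sizes (say 4 KiB, 64 KiB, 1 MiB) gives a rough comparison. Repeated passes over the same file are likely to be served from the OS page cache, so only the first pass over a file reflects the drive itself.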

htmlboss commented 4 years ago

We could also consider using the netcdf-c library instead of the C++ library. The former has a much larger set of interfaces and parameters to play with than the latter.

We also need to check our netcdf files for the following attribute: _NetCDF4Coordinates. This long thread explains why: https://github.com/Unidata/netcdf-c/issues/489

htmlboss commented 4 years ago

After examining some metrics provided by Linux (pmap, etc.), I found nothing else to optimise in terms of memory or IO. Any slowness is a result of drive rotation speed; this tool has next to zero overhead. The netcdf libraries use chunking to load the files from disk, which keeps memory usage very low, even with large files.
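For anyone who wants to repeat the memory check from inside the process rather than with pmap, here is a rough sketch that reads the peak resident set size from `/proc/self/status`. It is Linux-only and uses a different source than pmap (which reports per-mapping figures); `read_status_kib` and `peak_rss_kib` are hypothetical helper names, not part of this tool.

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: scans text in the format of /proc/<pid>/status
// for a line such as "VmHWM:     1234 kB" and returns the value in KiB.
// Returns -1 if the key is missing or its value cannot be parsed.
long read_status_kib(std::istream& status, const std::string& key) {
    std::string line;
    while (std::getline(status, line)) {
        const bool key_matches =
            line.size() > key.size() &&
            line.compare(0, key.size(), key) == 0 &&
            line[key.size()] == ':';
        if (key_matches) {
            std::istringstream fields(line.substr(key.size() + 1));
            long kib = -1;
            if (fields >> kib) {
                return kib;
            }
            return -1;
        }
    }
    return -1;
}

// Linux-only convenience wrapper: peak resident set size ("VmHWM")
// of the current process in KiB, or -1 if /proc is unavailable.
long peak_rss_kib() {
    std::ifstream status("/proc/self/status");
    return read_status_kib(status, "VmHWM");
}
```

Logging `peak_rss_kib()` once after the files have been processed would show whether peak memory stays small relative to the size of the nc files being read, which is what chunked loading should produce.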