aodn / content

Tracks AODN Portal content and configuration issues

Support NetCDF files bigger than 2 GB with OPeNDAP #410

Closed ggalibert closed 5 years ago

ggalibert commented 5 years ago

In order to properly support OPeNDAP for files bigger than 2 GB, one has to enable Large File Support by creating the NetCDF file in the 64-bit offset format (as opposed to the classic format). See https://www.unidata.ucar.edu/software/netcdf/faq-lfs.html .
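For reference, a minimal sketch of how to check and convert the format of a local copy (the output file name here is only illustrative, and the -k keyword spelling varies slightly between nccopy versions; older releases take -k 2 for 64-bit offset):

ncdump -k SSTAARS.nc                                 # prints the format kind, e.g. "classic" or "netCDF-4"
nccopy -k '64-bit offset' SSTAARS.nc SSTAARS_lfs.nc  # rewrite the file in 64-bit offset format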

The following files are currently affected and need to be regenerated in the correct format (please update if required):

ggalibert commented 5 years ago

For example, one should be able to use nccopy to produce a subset of a dataset available via OPeNDAP. In the example below I am requesting only one variable:

ggalibert@ggalibert-Latitude-E7450:/tmp$ nccopy -v TEMP_90th_perc http://thredds.aodn.org.au/thredds/dodsC/CSIRO/Climatology/SSTAARS/2017/SSTAARS.nc SSTAARS_TEMP_90th_perc.nc
NetCDF: One or more variable sizes violate format constraints
Location: file /build/netcdf-StLR0y/netcdf-4.4.0/ncdump/nccopy.c; line 1449

If successful, the retrieved file would be ~140 MB.

ocehugo commented 5 years ago

@ggalibert ,

The actual file in the bucket is in netcdf-4/hdf5 format.

I don't think this is related to LFS and the netCDF format. I was able to subset the SSTAARS file over OPeNDAP with NCO:

ncks -D9 -d LONGITUDE,131.,132. -d LATITUDE,-39.,-35 -v TEMP_10th_perc http://thredds.aodn.org.au/thredds/dodsC/CSIRO/Climatology/SSTAARS/2017/SSTAARS.nc x.nc
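(In that ncks call, -D9 only raises the debug verbosity, the -d LONGITUDE and -d LATITUDE options subset those dimensions by coordinate range, and -v extracts a single variable.)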

Also, you forgot to include "-4" in the nccopy call. If that flag is not provided, nccopy will always try to create a local netCDF-3 file; given that the remote file is larger than 2 GB, it will fail before even trying.
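A sketch of the corrected call (same URL and variable as in the earlier example; the output file name is arbitrary), which should write a local netCDF-4/HDF5 file and so avoid the classic-format size limits:

nccopy -4 -v TEMP_90th_perc http://thredds.aodn.org.au/thredds/dodsC/CSIRO/Climatology/SSTAARS/2017/SSTAARS.nc SSTAARS_TEMP_90th_perc.nc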

The problem reported in the user's email smells like a timeout problem. I had to wait ~10 s for the TDS DODS page for this file to open (which indicates the file is now held locally on the TDS server). After that I could query OPeNDAP quickly.

If the timeout for fetching this file is anywhere below that, there is not enough time for the file to be harvested by the TDS server. Even worse, there will be several calls in the sequence that will never complete...
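If the client side turns out to be the limiting factor, one possible mitigation (just a sketch, assuming the client goes through the netCDF-C OPeNDAP layer, which reads settings from ~/.dodsrc or ~/.daprc) is to raise the HTTP timeout there; the 60-second value is only an illustration:

# ~/.dodsrc
HTTP.TIMEOUT=60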