Reading-eScience-Centre / edal-java

Environmental Data Abstraction Layer libraries

Large netCDF-4 file reading strategy #69

Closed PeterWarren closed 7 years ago

PeterWarren commented 8 years ago

We are using thredds (ncwms currently but soon to be edal-java) to render wms layers of large (64GB) NetCDF-4 files. To avoid hitting out of memory errors we need to ensure the netcdf reading strategy is set to SCANLINE. Currently, the reading strategy chooser (getOptimumDataReadingStrategy) only selects SCANLINE if the file type is "netCDF" or "HDF4". Our files are "NetCDF-4", so the chooser falls back to the BOUNDING_BOX reading strategy and thredds quickly exhausts even very large memory allocations.

To avoid this we have patched our ncwms (thredds 4.6) to look for "NetCDF-4" type files and force them into SCANLINE mode. We would now like to find a more permanent solution for thredds 5.0 and onwards.

I have 2 proposed solutions:

  1. Add NetCDF-4 to the types that go into SCANLINE mode, as we have done previously. However, NetCDF-4 can be compressed, and comments around getOptimumDataReadingStrategy suggest that compressed files should be read with BOUNDING_BOX.
  2. Implement the todo in the getOptimumDataReadingStrategy method to "also use the size of the grids as a deciding factor" and choose a size in MB or data points at which to change to SCANLINE. Perhaps this solution is more appropriate because it addresses the real issue, which is the size of the file, not its type.

(1) is trivial so I won't provide any code for it. I had a go at implementing (2) (attached below). I assumed that all NetcdfDatasets could be considered gridded datasets; I am not sure if that's safe? I calculated the size of the dataset by taking the product of all dimensions.

try (GridDataset gridDataset = getGridDataset(nc)) {    // assume gridded dataset (possibly unsafe?)
    for (GridDatatype grid : gridDataset.getGrids()) {
        long totalSize = 1;
        DataType dt = grid.getDataType();
        int dataPointSize = dt.getSize();    // not used yet; could use the data point size in the estimate
        // take the product of all dimensions
        for (Dimension dim : grid.getDimensions()) {
            totalSize *= (long) dim.getLength();
        }
        // threshold is in data points: 100MB of single-byte data or 400MB of int32 or float data
        if (totalSize > (100 * 1024 * 1024)) {
            return DataReadingStrategy.SCANLINE;
        }
    }
} catch (DataReadingException | IOException e) {
    /* Ignore exception and try to choose a reading strategy based on file type. */
}

Please let me know what you think. NetCDF-4ReadingStratPatch.zip

Thanks

guygriffiths commented 8 years ago

This looks good. I've modified it slightly so that it uses the DataType size (from your code), and compares against a multiple of the maximum available memory. This will make it to the next release of ncWMS2, and should hopefully be in a subsequent TDS release.
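For readers following along, here is a rough sketch of what a memory-relative threshold could look like. It is not the code that went into edal-java: the class, the local ReadingStrategy enum, the method names and the memoryFactor parameter are all hypothetical, and it assumes "maximum available memory" means the value reported by Runtime.getRuntime().maxMemory().

```java
class MemoryBasedStrategyChooser {
    /** Local stand-in for the library's strategy enum (illustrative only). */
    enum ReadingStrategy { SCANLINE, BOUNDING_BOX }

    /** Estimated size in bytes: product of all dimension lengths times the size of one value. */
    static long estimateBytes(int[] dimensionLengths, int bytesPerValue) {
        long total = bytesPerValue;
        for (int length : dimensionLengths) {
            // multiplyExact throws ArithmeticException instead of silently wrapping
            total = Math.multiplyExact(total, (long) length);
        }
        return total;
    }

    /** Picks SCANLINE once the estimate exceeds memoryFactor times the given memory limit. */
    static ReadingStrategy choose(long estimatedBytes, long maxMemoryBytes, double memoryFactor) {
        return estimatedBytes > maxMemoryBytes * memoryFactor
                ? ReadingStrategy.SCANLINE
                : ReadingStrategy.BOUNDING_BOX;
    }

    public static void main(String[] args) {
        // Hypothetical grid: time x depth x lat x lon of 4-byte floats
        long estimate = estimateBytes(new int[] {1000, 50, 1800, 3600}, 4);
        long maxMemory = Runtime.getRuntime().maxMemory();
        // The factor of 1.0 is an arbitrary choice for this sketch
        System.out.println(estimate + " bytes -> " + choose(estimate, maxMemory, 1.0));
    }
}
```

The point of comparing against the JVM limit rather than a fixed 100MB figure is that the same build can then behave sensibly on servers with very different heap sizes.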

PeterWarren commented 8 years ago

Thank you Guy. When testing this I also noticed there are 2 very minor numeric overflow bugs in DerivedStaggeredGrid.size() and RectilinearGridImpl.size(). One or both of the integers used need to be cast to long before multiplying, e.g. return (long) xAxis.size() * (long) yAxis.size();
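To illustrate why the cast matters, here is a small standalone sketch. The class and method names are made up for the example and are not taken from edal-java. In Java, multiplying two int values is done in 32-bit arithmetic, so the product wraps around before it is widened to the long return type.

```java
class GridSizeOverflow {
    /** Buggy variant: int * int is evaluated in 32 bits and wraps before being widened to long. */
    static long sizeOverflowing(int xSize, int ySize) {
        return xSize * ySize;
    }

    /** Fixed variant: casting one operand to long forces a 64-bit multiplication. */
    static long sizeSafe(int xSize, int ySize) {
        return (long) xSize * ySize;
    }

    public static void main(String[] args) {
        int x = 100_000;
        int y = 100_000;
        System.out.println(sizeOverflowing(x, y)); // prints 1410065408 (wrapped)
        System.out.println(sizeSafe(x, y));        // prints 10000000000
    }
}
```

Casting only one operand is sufficient, because the other is promoted to long automatically; casting both, as in the snippet above, is equally correct and arguably clearer.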

guygriffiths commented 8 years ago

Great, thanks, I've fixed that one too.

guygriffiths commented 8 years ago

So after some testing, it turns out that this is having a very detrimental effect on displaying data from large datasets - SCANLINE is a lot slower for compressed data, and this change is picking SCANLINE for datasets which really don't need it.

I've changed the code so that only the size of the horizontal grid is taken into account. That's all that DataReadingStrategy applies to anyway, so this should give a more realistic estimate of the amount of data which needs to be read, and should only choose SCANLINE in cases where it's really necessary to avoid OutOfMemoryErrors. Once I've confirmed that it's all working properly, would you mind testing with your dataset to make sure that SCANLINE is still chosen?
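As a back-of-the-envelope illustration of the difference, the sketch below compares an estimate over all dimensions with one over the horizontal grid only. The class, method names and grid dimensions are hypothetical and chosen just for the arithmetic; this is not the edal-java implementation.

```java
class HorizontalGridEstimate {
    /** Bytes needed for one horizontal (x, y) slice of the variable. */
    static long horizontalBytes(int xSize, int ySize, int bytesPerValue) {
        return (long) xSize * ySize * bytesPerValue;
    }

    /** Bytes for the whole variable across x, y, vertical levels and time steps. */
    static long fullBytes(int xSize, int ySize, int zSize, int tSize, int bytesPerValue) {
        return (long) xSize * ySize * zSize * tSize * bytesPerValue;
    }

    public static void main(String[] args) {
        // Hypothetical 0.1-degree global grid, 50 levels, 1000 time steps, 4-byte floats
        System.out.println(horizontalBytes(3600, 1800, 4));     // prints 25920000
        System.out.println(fullBytes(3600, 1800, 50, 1000, 4)); // prints 1296000000000
    }
}
```

With these made-up numbers the full variable is over a terabyte, while a single horizontal slice is about 26MB, so a size check on the whole variable would pick SCANLINE for a dataset whose per-request reads are small.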

adamsteer commented 8 years ago

Hi Guy - do you have a compiled ncwms jar containing your change that will work with thredds 4.6? If so, I'd be really interested in testing it; we see similar issues on a TDS which uses the scan line reading modification. I'm not a Java developer, so grabbing a compiled jar is easiest - otherwise I'll try to compile one. Thanks!

PeterWarren commented 8 years ago

Thanks again Guy. I'll backport your patch into 4.6 for Adam and test the current master branch.

adamsteer commented 7 years ago

Just catching up here - what's the best way to go about testing this patch? Is it part of any edal-java release yet (wondering if I should leap ahead to TDS 5 at this point)? And/or where can I grab a compiled ncwms.jar file containing the patch for TDS 4.x? Thanks

guygriffiths commented 7 years ago

@adamsteer - Yes, this will have made it into any recent edal-java release, and so should be available in the latest TDS 5 builds. @PeterWarren would be better placed to tell you whether this is in any 4.x version of TDS