lutraconsulting / MDAL

Mesh Data Abstraction Library
http://www.mdal.xyz/
MIT License
160 stars 50 forks source link

When importing ECMWF / Copernicus NetCDF files MDAL seems not to be applying scaling factors and offsets. #107

Closed ClintBlight closed 5 years ago

ClintBlight commented 5 years ago

Hi,

I've just been helping a colleague use R to download and work with some reanalysis datasets from ECMWF and Copernicus. I thought I'd also try using the MDAL import option to load them into QGIS 3.6 on Windows. I was able to successfully load in new MDAL layers within which there were 10 metre wind groups. The displayed magitudes and vector arrows showed the expected sort of patterns. However, when I checkd the ranges of values the magnitudes were up in the 1000s of metres per second :-(

In case something was wrong with the files I'd downloaded, I switched to the example ECMWF_ERA-40_subset.nc files mentioned in this blog post: https://www.lutraconsulting.co.uk/blog/2018/10/18/mdal/. Again the 10m windspeeds ranged way up into the 10,000s of metres per second. I tested the same file in UniData's IDV and it generated displays indicating much more reasonable ranges of a few 10s of metres per second.

I did a bit of digging and it looks like the values that the MDAL layer is showing are probably just the raw short integers from the NetCDF files. Any associated scaling factors and offsets necessary to convert those back into the original floating point values currently don't seem to be getting picked up from the NetCDF flies and applied to the integer numbers.

This extract from running the ncdump utility on the ECMWF_ERA-40_subset.nc file shows the scale_factors and offsets for the two variables used to store the 10m u and v wind components:

    short p10u(time, latitude, longitude) ;
            p10u:scale_factor = 0.0007584155104299 ;
            p10u:add_offset = -0.440509086897149 ;
            p10u:_FillValue = -32767s ;
            p10u:missing_value = -32767s ;
            p10u:units = "m s**-1" ;
            p10u:long_name = "10 metre U wind component" ;
    short p10v(time, latitude, longitude) ;
            p10v:scale_factor = 0.000664359461014752 ;
            p10v:add_offset = -0.745888358484452 ;
            p10v:_FillValue = -32767s ;
            p10v:missing_value = -32767s ;
            p10v:units = "m s**-1" ;
            p10v:long_name = "10 metre V wind component" ;

Hopefullly either I've just missed something or this will just be quite a quick fix. Happy to supply more info if needed as having MDAL in QGIS is potentially very useful for a few of us in my Unit.

cheers

Clint

PeterPetrik commented 5 years ago

Hi, these data are just taken from GDAL, see https://github.com/lutraconsulting/MDAL/blob/master/mdal/frmts/mdal_gdal.cpp#L333

can you load the file as raster layer and check in QGIS if the scale factor is applied there?

PeterPetrik commented 5 years ago

ad R: https://github.com/lutraconsulting/MDAL/issues/42

ClintBlight commented 5 years ago

Sorry, I should have mentioned in original report that values when read in as rasters had matched what one would expect.

QGIS_ECMWF_UV_MDAL_diff_to_raster

ClintBlight commented 5 years ago

Just in case it was something odd to do with a two element group like u+v making wind group I've just tried pretty much the same thing with just the first timestep of the mean sea level pressure (msl) field from that same ECMWF file. It is also stored in the file as integers with associated scaling factor and offset.

ncdump shows:

    short msl(time, latitude, longitude) ;
            msl:scale_factor = 0.1721754257462 ;
            msl:add_offset = 99424.2653245743 ;
            msl:_FillValue = -32767s ;
            msl:missing_value = -32767s ;
            msl:units = "Pa" ;
            msl:long_name = "Mean sea level pressure" ;

The resutls in QGIS were similar:

Attached screenshots try to show this. The one with the different colourscale is the raster layer representation in which I think the values are correct. Raster_MSL_Vals_OK Raster_MSL_unscaled MDAL_MSL_Vals

ClintBlight commented 5 years ago

Just had a look at some of the pages in the GDAL API and found some notes in the GetOffset and GetScale sections of the page for GDALRasterBand

GetOffset https://www.gdal.org/classGDALRasterBand.html#a11e59146b02d52127c8427cccc37683d

Fetch the raster value offset.

This value (in combination with the GetScale() value) can be used to transform raw pixel values into the units returned by GetUnitType(). For example this might be used to store elevations in GUInt16 bands with a precision of 0.1, and starting from -100.

Units value = (raw value * scale) + offset

Note that applying scale and offset is of the responsibility of the user, and is not done by methods such as RasterIO() or ReadBlock().

For file formats that don't know this intrinsically a value of zero is returned.

GetScale https://www.gdal.org/classGDALRasterBand.html#a4468122d33adb8978c242a0d6bdc5f0f

Fetch the raster value scale.

This value (in combination with the GetOffset() value) can be used to transform raw pixel values into the units returned by GetUnitType(). For example this might be used to store elevations in GUInt16 bands with a precision of 0.1, and starting from -100.

Units value = (raw value * scale) + offset

Note that applying scale and offset is of the responsibility of the user, and is not done by methods such as RasterIO() or ReadBlock(). For file formats that don't know this intrinsically a value of one is returned.

My C++ skills are pretty minimal but quickly scanned through that .cpp you gave a link to and the other one that looked potentially relevant for this type of file.

I didn't spot any obvious calls to GetOffset or GetScale. So at a guess, hopefully all that might be needed for a fairly generic fix might be a few extra lines to which for each band would use those functions to get and thne apply any offset and scale values in the data files (or the default 0 and 1 for formats that don't use them).

nyalldawson commented 5 years ago

For reference - this is where it's done in the qgs raster data provider: https://github.com/qgis/QGIS/blob/master/src/providers/gdal/qgsgdalprovider.cpp#L665 (also similar for the various raster stats methods)

PeterPetrik commented 5 years ago

patch supplied:

Screenshot 2019-04-08 at 14 28 56
ClintBlight commented 5 years ago

Sorry not to follow up on this till now. Many thanks for sorting out a fix so quickly.

After a bit of fiddling around I eventually managed to get a 3.7 development version of QGIS building on my Windows machine with the new MDAL code. I gave it a quick test by importing the same ECMWF mean sea level pressure field and this time the range of values reported for the mesh version looks much better (as shown below).

I see from some of the posts today that the patch might even make it into the QGIS 3.6.2 release. If so that'll be great. Once it's available, all I'll need to do is tell the colleague working with some of these ECMWF datasets to update their QGIS installation.

cheers

Clint

Fixed_Scaling_QGIS_Mesh_GDAL

PeterPetrik commented 5 years ago

Thanks, merged the fix to 3.6.2 so you will get it tomorrow :)