JuliaGeo / NetCDF.jl

NetCDF support for the Julia programming language
http://juliageo.org/NetCDF.jl/
MIT License

NetCDF reads values from file wrongly. #107

natgeo-wong closed this issue 4 years ago

natgeo-wong commented 4 years ago

Hello,

I am reading a large NetCDF file (900+ MB) using NetCDF.jl. When I read a value with the NetCDF package, I get a negative number. This should not be possible, because the data is all positive.

The same file reads correctly with the NCDatasets package and with MATLAB.

When using NetCDF:

julia> a = ncread("/Volumes/CliNat-ERA/era5/SEA/tcwv/tmp/era5-SEA-tcwv-sfc-2017.nc","tcwv",[3,4,5],[1,1,1])
1×1×1 Array{Int16,3}:
[:, :, 1] =
 -20740

When using NCDatasets:

julia> ds = Dataset("/Volumes/CliNat-ERA/era5/SEA/tcwv/tmp/era5-SEA-tcwv-sfc-2017.nc")
julia> tcwv = ds["tcwv"][3,4,5]
18.572314914331237

And when using MATLAB:

ncread("/Volumes/CliNat-ERA/era5/SEA/tcwv/tmp/era5-SEA-tcwv-sfc-2017.nc","tcwv",[3,4,5],[1,1,1])
ans = 
    18.5723

Does anyone have a sense of what the issue might be? NCDatasets reports the following for the file:

julia> ds = Dataset("/Volumes/CliNat-ERA/era5/SEA/tcwv/tmp/era5-SEA-tcwv-sfc-2017.nc")
Dataset: /Volumes/CliNat-ERA/era5/SEA/tcwv/tmp/era5-SEA-tcwv-sfc-2017.nc
Group: /

Dimensions
   longitude = 301
   latitude = 141
   time = 8760

Variables
  longitude   (301)
    Datatype:    Float32
    Dimensions:  longitude
    Attributes:
     units                = degrees_east
     long_name            = longitude

  latitude   (141)
    Datatype:    Float32
    Dimensions:  latitude
    Attributes:
     units                = degrees_north
     long_name            = latitude

  time   (8760)
    Datatype:    Int32
    Dimensions:  time
    Attributes:
     units                = hours since 1900-01-01 00:00:00.0
     long_name            = time
     calendar             = gregorian

  tcwv   (301 × 141 × 8760)
    Datatype:    Int16
    Dimensions:  longitude × latitude × time
    Attributes:
     scale_factor         = 0.0013398768625944388
     add_offset           = 46.3613610445399
     _FillValue           = -32767
     missing_value        = -32767
     units                = kg m**-2
     long_name            = Total column water vapour
     standard_name        = lwe_thickness_of_atmosphere_mass_content_of_water_vapor

Global attributes
  Conventions          = CF-1.6
  history              = 2019-04-15 07:16:09 GMT by grib_to_netcdf-2.10.0: /opt/ecmwf/eccodes/bin/grib_to_netcdf -o /cache/data0/adaptor.mars.internal-1555311761.772098-29751-1-e896a99a-0eaf-4622-b36a-eb9407eab000.nc /cache/tmp/e896a99a-0eaf-4622-b36a-eb9407eab000-adaptor.mars.internal-1555311761.7736168-29751-1-tmp.grib

natgeo-wong commented 4 years ago

Actually, I think I might have found the issue, looking now at the Dataset output.

The tcwv variable is actually stored as Int16, and NCDatasets automatically converts it to a float upon reading, using the scale_factor and add_offset attributes.

That is actually pretty smart. But apparently NetCDF.jl doesn't do the conversion automatically, so I'd have to do it manually.

visr commented 4 years ago

Yes, indeed, that should be it. I'm going to close this as a duplicate of #39. It would be nice to at least have an option to apply these attributes automatically, but for now you'd have to read the attribute values and apply them yourself.
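
For anyone landing here later, the manual conversion can be sketched roughly as below. This is only an illustration of the usual CF packing rule (unpacked = packed * scale_factor + add_offset), using the attribute values and the raw Int16 reading quoted earlier in this thread. The `unpack` helper and the constant names are hypothetical and not part of NetCDF.jl.

```julia
# Attribute values copied from the `tcwv` variable listing in this issue.
const SCALE_FACTOR = 0.0013398768625944388
const ADD_OFFSET   = 46.3613610445399
const FILL_VALUE   = Int16(-32767)

# Hypothetical helper (not a NetCDF.jl function).
# Applies the CF packing rule: unpacked = packed * scale_factor + add_offset.
# Packed values equal to the fill value are returned as `missing`.
function unpack(raw::Integer; scale=SCALE_FACTOR, offset=ADD_OFFSET, fill=FILL_VALUE)
    return raw == fill ? missing : raw * scale + offset
end

# The raw value that ncread returned for index [3,4,5] in the report above.
raw = Int16[-20740]

# Broadcast the helper over the raw array.
unpacked = unpack.(raw)
println(unpacked[1])   # should be close to the 18.5723 that NCDatasets and MATLAB report
```

In practice the two attributes would be read from the file rather than hard-coded; I believe NetCDF.jl's `ncgetatt(filename, varname, attname)` can do that, but treat that call as an assumption and check the package docs.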