JuliaGeo / GRIBDatasets.jl

A high level interface to GRIB encoded files.
MIT License
12 stars 5 forks source link

Issue with eccodes_jll v2.36.0+0 #35

Open Alexander-Barth opened 1 month ago

Alexander-Barth commented 1 month ago

I am investigating recent failures in the integration test (https://github.com/JuliaGeo/CommonDataModel.jl/actions/runs/10672550987). The error is reproducible, when I install the current main version of GRIBDatasets.jl (4f138719ec5c1443baa06c33aed221fd82ac3740) is an empty environment. I get the following test error:

    Testing Running tests...
redundant vertical dims: Error During Test at /home/abarth/.julia/dev/GRIBDatasets/test/dimensions.jl:72
  Got exception outside of a @test
  KeyError: key "gridType" not found
  Stacktrace:
    [1] getindex
      @ ./dict.jl:498 [inlined]
    [2] getone
      @ ~/.julia/dev/GRIBDatasets/src/index.jl:105 [inlined]
    [3] _horizontal_gridtype(index::FileIndex{Float64})
      @ GRIBDatasets ~/.julia/dev/GRIBDatasets/src/dimensions.jl:97
    [4] _alldims(index::FileIndex{Float64})
      @ GRIBDatasets ~/.julia/dev/GRIBDatasets/src/dimensions.jl:112
    [5] macro expansion
      @ ~/.julia/dev/GRIBDatasets/test/dimensions.jl:83 [inlined]
    [6] macro expansion
      @ /mnt/data1/abarth/.julia/juliaup/julia-1.10.0+0.x64.linux.gnu/share/julia/stdlib/v1.10/Test/src/Test.jl:1577 [inlined]
    [7] macro expansion
      @ ~/.julia/dev/GRIBDatasets/test/dimensions.jl:73 [inlined]
    [8] macro expansion
      @ /mnt/data1/abarth/.julia/juliaup/julia-1.10.0+0.x64.linux.gnu/share/julia/stdlib/v1.10/Test/src/Test.jl:1577 [inlined]
    [9] top-level scope
      @ ~/.julia/dev/GRIBDatasets/test/dimensions.jl:14
   [10] include(fname::String)
      @ Base.MainInclude ./client.jl:489
   [11] macro expansion
      @ ~/.julia/dev/GRIBDatasets/test/runtests.jl:30 [inlined]
   [12] macro expansion
      @ /mnt/data1/abarth/.julia/juliaup/julia-1.10.0+0.x64.linux.gnu/share/julia/stdlib/v1.10/Test/src/Test.jl:1577 [inlined]
   [13] macro expansion
      @ ~/.julia/dev/GRIBDatasets/test/runtests.jl:30 [inlined]
   [14] macro expansion
      @ /mnt/data1/abarth/.julia/juliaup/julia-1.10.0+0.x64.linux.gnu/share/julia/stdlib/v1.10/Test/src/Test.jl:1577 [inlined]
   [15] top-level scope
      @ ~/.julia/dev/GRIBDatasets/test/runtests.jl:28
   [16] include(fname::String)
      @ Base.MainInclude ./client.jl:489
   [17] top-level scope
      @ none:6
   [18] eval
      @ Core ./boot.jl:385 [inlined]
   [19] exec_options(opts::Base.JLOptions)
      @ Base ./client.jl:291
   [20] _start()
      @ Base ./client.jl:552
┌ Warning: The length of dimension valid_time in variable u is different
│                 from the corresponding dimension in the dataset. This could lead to unexpected
│                 behaviour.
└ @ GRIBDatasets ~/.julia/dev/GRIBDatasets/src/variables.jl:160
variable indexing with redundant level: Error During Test at /home/abarth/.julia/dev/GRIBDatasets/test/dataset.jl:120
  Got exception outside of a @test
  The key `u10` was been found in the dataset. Available keys: ["lon", "lat", "hybrid", "heightAboveGround", "heightAboveGround_2", "valid_time", "u", "v", "etadot", "t", "sp", "q", "lsp", "acpcp", "sshf", "iews", "inss", "ssr", "avg_sd_m", "avg_msl", "tcc", "avg_10u", "avg_10v", "mean2t", "avg_2d", "avg_z", "lsm", "sdor", "cvl", "cvh", "avg_fsr"]
  Stacktrace:
    [1] error(s::String)
      @ Base ./error.jl:35
    [2] Variable(ds::GRIBDataset{Float64, 6, Missing}, key::String)
      @ GRIBDatasets ~/.julia/dev/GRIBDatasets/src/variables.jl:174
    [3] cfvariable(ds::GRIBDataset{Float64, 6, Missing}, varname::String; maskingvalue::Missing)
[...]

With the following dependencies:

Status `/mnt/data1/abarth/.julia/environments/grib-released/Manifest.toml`
  [179af706] CFTime v0.1.3
  [1fbeeb36] CommonDataModel v0.3.6
  [34da2185] Compat v4.16.0
  [864edb3b] DataStructures v0.18.20
  [3c3547ce] DiskArrays v0.4.4
  [b16dfd50] GRIB v0.4.0
  [82be9cdb] GRIBDatasets v0.3.2 `~/.julia/dev/GRIBDatasets`
  [692b3bcd] JLLWrappers v1.6.0
  [8ac3fa9e] LRUCache v1.6.1
  [6fe1bfb0] OffsetArrays v1.14.1
  [bac558e1] OrderedCollections v1.6.3
  [21216c6a] Preferences v1.4.3
  [aacddb02] JpegTurbo_jll v3.0.3+0
⌅ [88015f11] LERC_jll v3.0.0+1
  [89763e89] Libtiff_jll v4.6.0+0
  [d3a379c0] LittleCMS_jll v2.16.0+0
  [643b3616] OpenJpeg_jll v2.5.2+0
  [ffd25f8a] XZ_jll v5.4.6+0
  [3161d3a3] Zstd_jll v1.5.6+0
  [b04048ba] eccodes_jll v2.36.0+0
  [477f73a3] libaec_jll v1.1.2+0
  [b53b4c65] libpng_jll v1.6.43+1
  [0dad84c5] ArgTools v1.1.1
  [56f22d72] Artifacts
  [2a0f44e3] Base64
  [ade2ca70] Dates
  [f43a241f] Downloads v1.6.0
  [7b1f6079] FileWatching
  [b77e0a4c] InteractiveUtils
  [b27032c2] LibCURL v0.6.4
  [76f85450] LibGit2
  [8f399da3] Libdl
  [37e2e46d] LinearAlgebra
  [56ddb016] Logging
  [d6f4376e] Markdown
  [ca575930] NetworkOptions v1.2.0
  [44cfe95a] Pkg v1.10.0
  [de0858da] Printf
  [3fa0cd96] REPL
  [9a3f8284] Random
  [ea8e919c] SHA v0.7.0
  [9e88b42a] Serialization
  [6462fe0b] Sockets
  [2f01184e] SparseArrays v1.10.0
  [10745b16] Statistics v1.10.0
  [fa267f1f] TOML v1.0.3
  [a4e569a6] Tar v1.10.0
  [cf7118a7] UUIDs
  [4ec0a83e] Unicode
  [e66e0078] CompilerSupportLibraries_jll v1.0.5+1
  [deac9b47] LibCURL_jll v8.4.0+0
  [e37daf67] LibGit2_jll v1.6.4+0
  [29816b5a] LibSSH2_jll v1.11.0+1
  [c8ffd9c3] MbedTLS_jll v2.28.2+1
  [14a3606d] MozillaCACerts_jll v2023.1.10
  [4536629a] OpenBLAS_jll v0.3.23+2
  [bea87d4a] SuiteSparse_jll v7.2.1+1
  [83775a58] Zlib_jll v1.2.13+1
  [8e850b90] libblastrampoline_jll v5.8.0+1
  [8e850ede] nghttp2_jll v1.52.0+1
  [3f19e933] p7zip_jll v17.4.0+2

If I manually downgrade eccodes_jll to v2.28.0 (the previous released version), all tests pass.

Independently of the test suite, the issue can be reproduced with the sample file ENH18080914:

using GRIBDatasets:  GribFile
grib_path = "/home/abarth/.julia/dev/GRIBDatasets/test/sample-data/ENH18080914"
f = GribFile(grib_path)
m = collect(f)[end]
@show m["cfVarName"]
# output "avg_fsr" for eccodes_jll v2.36.0+0
# output "sr" for eccodes_jll v2.28.0+0

Is this a known problem? Thank you for your insights!

Alexander-Barth commented 2 days ago

Would a PR pinning the version of eccodes_jll (indirect dependency via GRIB.jl) be helpful?

Ref: https://github.com/weech/GRIB.jl/issues/19

lupemba commented 2 days ago

After a bit of digging. This page has a list of some of the changed variable names. https://confluence.ecmwf.int/display/MTG2US/Changes+in+ecCodes+version+2.36.0+compared+to+the+previous+version

It looks like, they don't considering changing the short names breaking. This is not good since GRIBDataset.jl clearly relies on the short names.

paramID Old shortName New shortName
235080 mrsn avg_rsn
235081 mcvl avg_cvl
235082 mcvh avg_cvh
235083 mci avg_ci
235084 msst avg_sst
235085 mlai_lv avg_lai_lv
235086 mlai_hv avg_lai_hv
235087 mtclw avg_tclw
235088 mtciw avg_tciw
235089 m2sh avg_2sh
235090 mlmlt avg_lmlt
235091 mlmld avg_lmld
235092 m2r avg_2r
235093 mfscov avg_fscov
235094 msot avg_sot
235131 mu avg_u
235132 mv avg_v
235134 msp avg_sp
235136 mtcw avg_tcw
235137 mtcwv avg_tcwv
235141 msd_m avg_sd_m
235151 mmsl avg_msl
235157 mr avg_r
235165 m10u avg_10u
235166 m10v avg_10v
235168 m2d avg_2d
235186 mlcc_frac avg_lcc_frac
235187 mmcc_frac avg_mcc_frac
235188 mhcc_frac avg_hcc_frac
235238 mtsn avg_tsn
235243 mfal_frac avg_fal_frac
235244 mfsr avg_fsr
235245 mflsr avg_flsr
tcarion commented 1 day ago

Hi!

Indeed, the error comes from the change in those short names. It's because the tests are still using the old variable name. Here's the error with a higher level test:

u10 = ds2["u10"]
ERROR: LoadError: The key `u10` was not found in the dataset. Available keys: ["lon", "lat", "hybrid", "heightAboveGround", "heightAboveGround_2", "valid_time", "u", "v", "etadot", "t", "sp", "q", "lsp", "acpcp", "sshf", "iews", "inss", "ssr", "avg_sd_m", "avg_msl", "tcc", "avg_10u", "avg_10v", "mean2t", "avg_2d", "avg_z", "lsm", "sdor", "cvl", "cvh", "avg_fsr"]

Which is much more explicit :-)

I'll fix that!

BTW, I'm sorry I'm not really active anymore, I changed my job and I don't really use Julia anymore (unfortunately)

lupemba commented 1 day ago

Hi @tcarion

BTW, I'm sorry I'm not really active anymore, I changed my job and I don't really use Julia anymore (unfortunately)

This is completely understandable. I really appreciate what you have done with the package so far. GRIBDataset.jl is useful for me in my work and I hope that I can contribute more to it in the future. That being said, I will probably not be able to do any significant contribution to the package the next 6 months but I hope I can do it after.

BTW, I have made a PR to pin eccodes_jll to make the variable names stable.

Alexander-Barth commented 23 hours ago

I would like to second @lupemba's appreciation of your work :-)

Maybe, it would be useful to look-up the variable by the CF standard names?

using GRIBDatasets
using CommonDataModel: @CF_str
ds = GRIBDataset("/home/abarth/.julia/dev/GRIBDatasets/test/sample-data/era5-levels-members.grib");
ds["isobaricInhPa"]
#isobaricInhPa (2)
#  Datatype:    Int64 (Int64)
#  Dimensions:  isobaricInhPa
#  Attributes:
#   units                = hPa
#   stored_direction     = decreasing
#   long_name            = pressure
#   axis                 = Z
#   standard_name        = air_pressure
#   positive             = down

ds["isobaricInhPa"][:] == ds[CF"air_pressure"][:]
# returns true

This assumes that there is only one variable with a given standard_name but it is often the case. The CF standard names are probably more stable than the eccodes, but it is maybe a bit annoying to use the CF prefix all the time.