ecmwf-lab / ai-models

Apache License 2.0

WrongStepError while trying to save the output in GRIB File #21

Status: Closed (closed by fakemonk1 10 months ago)

fakemonk1 commented 11 months ago

Hi,

I am trying to run the model on the GPU instance by using the following command:

ai-models --assets assets-graphcast --input cds --date 20231123 --time 600 --only-gpu graphcast

Inference completes, but the run fails while saving the output to the GRIB file. The error log is below. How can I fix this?

2023-12-15 07:56:41,285 INFO Doing full rollout prediction in JAX: 3 minutes 3 seconds.
<class 'xarray.core.dataset.Dataset'>
2023-12-15 07:56:41,285 INFO Converting output xarray to GRIB and saving
ECCODES ERROR   :  endStep < startStep (6 < 11)
2023-12-15 07:56:44,026 ERROR Error setting step=6
2023-12-15 07:56:44,026 INFO Saving output data: 2 seconds.
2023-12-15 07:56:44,026 INFO Total time: 3 minutes 32 seconds.

Traceback (most recent call last):
  File "/home/azureuser/anaconda3/lib/python3.11/site-packages/climetlab/readers/grib/codes.py", line 243, in set
    return eccodes.codes_set(self.handle, name, value)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/azureuser/anaconda3/lib/python3.11/site-packages/gribapi/gribapi.py", line 2121, in grib_set
    grib_set_long(msgid, key, value)
  File "/home/azureuser/anaconda3/lib/python3.11/site-packages/gribapi/gribapi.py", line 993, in grib_set_long
    GRIB_CHECK(lib.grib_set_long(h, key.encode(ENC), value))
  File "/home/azureuser/anaconda3/lib/python3.11/site-packages/gribapi/gribapi.py", line 226, in GRIB_CHECK
    errors.raise_grib_error(errid)
  File "/home/azureuser/anaconda3/lib/python3.11/site-packages/gribapi/errors.py", line 381, in raise_grib_error
    raise ERROR_MAP[errid](errid)
gribapi.errors.WrongStepError: Unable to set step

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/azureuser/anaconda3/bin/ai-models", line 33, in <module>
    sys.exit(load_entry_point('ai-models', 'console_scripts', 'ai-models')())
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/azureuser/ai-models/ai_models/__main__.py", line 291, in main
    _main()
  File "/home/azureuser/ai-models/ai_models/__main__.py", line 264, in _main
    model.run()
  File "/home/azureuser/ai-models-graphcast/ai_models_graphcast/model.py", line 262, in run
    save_output_xarray(
  File "/home/azureuser/ai-models-graphcast/ai_models_graphcast/output.py", line 60, in save_output_xarray
    write(
  File "/home/azureuser/ai-models/ai_models/model.py", line 104, in write
    self.output.write(*args, **kwargs),
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/azureuser/ai-models/ai_models/outputs/__init__.py", line 36, in write
    return self.output.write(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/azureuser/anaconda3/lib/python3.11/site-packages/climetlab/readers/grib/output.py", line 141, in write
    handle.set(k, v)
  File "/home/azureuser/anaconda3/lib/python3.11/site-packages/climetlab/readers/grib/codes.py", line 246, in set
    raise ValueError("Error setting %s=%s (%s)" % (name, value, e))
ValueError: Error setting step=6 (Unable to set step)
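
A note for reading the two tracebacks above: the `ValueError` is raised while the `WrongStepError` is still being handled, so Python records the original error on the new exception as `__context__`. The sketch below reproduces that wrapping pattern with stand-in names. `WrongStepError`, `low_level_set`, `set_key`, and `describe_failure` are hypothetical placeholders for illustration, not the real eccodes or climetlab code.

```python
class WrongStepError(Exception):
    """Stand-in for gribapi.errors.WrongStepError (hypothetical)."""


def low_level_set(name, value):
    # Pretend the underlying library rejects the value for this key.
    raise WrongStepError("Unable to set %s" % name)


def set_key(name, value):
    # Same shape as the wrapper seen in the traceback: catch the low-level
    # error and raise a ValueError that carries more detail.
    try:
        return low_level_set(name, value)
    except Exception as e:
        raise ValueError("Error setting %s=%s (%s)" % (name, value, e))


def describe_failure(name, value):
    # Returns (outer message, type name of the original error),
    # or None if the call succeeded.
    try:
        set_key(name, value)
    except ValueError as outer:
        original = outer.__context__  # attached implicitly by Python
        return str(outer), type(original).__name__
    return None
```

Calling `describe_failure("step", 6)` yields the outer message together with the name of the underlying error type, which is usually the more informative of the two when debugging.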

fakemonk1 commented 11 months ago

@b8raoult would appreciate it if you could help me here!

floriankrb commented 11 months ago

Could you please add the versions of the packages you are using (from `pip freeze` or the conda equivalent)?
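
As an alternative to pasting a full `pip freeze`, the versions of just the relevant packages can be collected with the standard library. This is a minimal sketch; the distribution names in `PACKAGES` are assumptions based on the modules that appear in the traceback and may need adjusting for a given environment.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed distribution names; adjust to match your environment.
PACKAGES = ["ai-models", "ai-models-graphcast", "climetlab", "eccodes"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(PACKAGES).items():
        print(f"{name}: {ver or 'not installed'}")
```

Running this in the same environment that launches `ai-models` matters, since a conda environment and a system Python can hold different versions.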

fakemonk1 commented 11 months ago

Hey Florian, thanks for your response. Attached are the lists of packages installed on the VM. Is there a particular library you are interested in?

pip_list.txt conda_list.txt

floriankrb commented 11 months ago

Yes, especially the versions of climetlab and ai-models. There were some recent changes related to writing GRIB files (https://github.com/ecmwf/climetlab/commit/867967ca395a5ae7bca5817c9c5b7df115a4a5f3), though I am not sure they are related. Could you try updating those packages to the latest versions (0.19.1 and 0.3.1, respectively)?

chomutovskij commented 11 months ago

@floriankrb I am facing the same error. I upgraded all the dependencies to their latest versions (pip list attached), but that didn't resolve the issue.

pip_list_andrej.txt

fakemonk1 commented 10 months ago

@floriankrb @b8raoult The installed versions of climetlab and ai-models are 0.19.1 and 0.3.1, respectively. Could you please advise on what the issue might be and how we can resolve it?

b8raoult commented 10 months ago

This should be fixed in version 0.0.5 of the plugin.
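
To confirm that an environment already has the fixed release, the installed plugin version can be compared against 0.0.5. This is a sketch under two assumptions: that "the plugin" is distributed as `ai-models-graphcast` (the package seen in the traceback), and that a simple numeric comparison is good enough. The helpers `version_tuple`, `has_fix`, and `plugin_has_fix` are illustrative names, and pre-release suffixes are ignored rather than ordered properly.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def version_tuple(text):
    """Turn '0.0.5' into (0, 0, 5); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def has_fix(installed, minimum="0.0.5"):
    """True if the installed version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


def plugin_has_fix(dist_name="ai-models-graphcast"):
    """None if the distribution is not installed, otherwise True or False."""
    try:
        return has_fix(version(dist_name))
    except PackageNotFoundError:
        return None
```

If `plugin_has_fix()` returns `False`, upgrading the plugin is the first thing to try. A result of `None` suggests the plugin is installed under a different name or in a different environment.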

chomutovskij commented 10 months ago

Confirming, it's fixed, thanks a lot @b8raoult

fakemonk1 commented 10 months ago

Thanks @b8raoult for fixing this!