ecmwf / climetlab

Python package for easy access to weather and climate data
Apache License 2.0

Dataset registering change ? #34

Closed jodemaey closed 2 years ago

jodemaey commented 2 years ago

Is it possible that, with version 0.10.4, the way climetlab registers local datasets has changed? Let me explain: I am using it on a JupyterHub and sharing a conda environment among users with nb_conda_kernels. I have the Eumetnet plugin installed, but as a local editable version, in an environment that is shared with the other users. As the admin, I can use my own environment and climetlab finds the local plugin/dataset, but the other users can't: for them, it tries to connect to GitHub to find the list of available plugins but doesn't find the Eumetnet one because it is not yet on this list, which results in an error.

This was not the case with version 0.9.8, so I wonder if something changed? It could also be a mistake of mine in handling the environments, but if you could confirm or rule that out, that would be great.

Thank you in advance,

Jonathan

floriankrb commented 2 years ago

Yes, things have changed since version 0.9.8 and we are not yet ensuring backward-compatibility for all features.

As I understand, you and your users are using the same version of climetlab (0.10.4), right?

"but the other users can't and for them it tries to connect to github to find the list of plugin available but don't find the Eumetnet one because it is not yet on this list,"

I am not sure I understand the details:

"the other users can't": do they have an error message?

"it tries to connect to github": which address? Any message in the logs?

"but don't find the Eumetnet one because it is not yet on this list": which list? Any message in the logs?

I would think that cleaning your cache and/or theirs should solve the issue.

jodemaey commented 2 years ago

Hi Florian,

Super thanks for this. Alas, removing the cache of all the users did not solve the problem.

I get errors AND warnings. Here is the error output on the side of the users (I have no problem myself):

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Input In [4], in <module>
----> 1 ds = cml.load_dataset(
      2     "eumetnet-postprocessing-benchmark-training-data-gridded-forecasts-surface",
      3     date="2017-12-02",
      4     parameter="2t",
      5     kind="highres",
      6 )

File /opt/tljh/user/envs/climetlab/lib/python3.9/site-packages/climetlab/datasets/__init__.py:253, in load_dataset(name, *args, **kwargs)
    240 def load_dataset(name: str, *args, **kwargs) -> Dataset:
    241     """Loads a dataset.
    242 
    243     Parameters
   (...)
    251         The loaded dataset.
    252     """
--> 253     klass = get_dataset.lookup(name)
    255     if name not in TERMS_OF_USE_SHOWN:
    256         if klass.terms_of_use is not None:

File /opt/tljh/user/envs/climetlab/lib/python3.9/site-packages/climetlab/datasets/__init__.py:221, in DatasetMaker.lookup(self, name)
    218 def lookup(self, name):
    220     loader = DatasetLoader()
--> 221     klass = find_plugin(os.path.dirname(__file__), name, loader)
    223     return klass

File /opt/tljh/user/envs/climetlab/lib/python3.9/site-packages/climetlab/core/plugins.py:168, in find_plugin(directories, name, loader)
    160     LOG.warning(
    161         "Cannot find %s '%s', did you mean '%s'?",
    162         loader.kind,
    163         name,
    164         correction,
    165     )
    167 candidates = ", ".join(sorted(c for c in candidates if "-" in c))
--> 168 raise NameError(f"Cannot find {loader.kind} '{name}' (values are: {candidates})")

NameError: Cannot find dataset 'eumetnet-postprocessing-benchmark-training-data-gridded-forecasts-surface' (values are: era5-precipitations, era5-single-levels, era5-temperature, example-dataset, high-low, hurricane-database, meteonet-radar-rainfall, meteonet-samples-ground-stations, meteonet-samples-masks, meteonet-samples-radar, meteonet-samples-weather-models, sample-bufr-data, sample-grib-data, weather-bench)

and this is the list I am talking about. Now here is the warning:

HEAD https://github.com/ecmwf-lab/climetlab-datasets/raw/main/datasets/eumetnet-postprocessing-benchmark-training-data-gridded-forecasts-surface.yaml
Traceback (most recent call last):
  File "/opt/tljh/user/envs/climetlab/lib/python3.9/site-packages/multiurl/http.py", line 77, in headers
    r.raise_for_status()
  File "/opt/tljh/user/envs/climetlab/lib/python3.9/site-packages/requests/models.py", line 960, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://github.com/ecmwf-lab/climetlab-datasets/raw/main/datasets/eumetnet-postprocessing-benchmark-training-data-gridded-forecasts-surface.yaml
URL https://github.com/ecmwf-lab/climetlab-datasets/raw/main/datasets/eumetnet-postprocessing-benchmark-training-data-gridded-forecasts-surface.yaml: 

<html>
[... lot of html]
</html>

Cannot find dataset 'eumetnet-postprocessing-benchmark-training-data-gridded-forecasts-surface', did you mean 'meteonet-radar-rainfall'?

Thank you in advance.

floriankrb commented 2 years ago

As I understand:

"Plugin installed correctly" means two things:

First, the plugin needs to be importable. I would expect that you can do

  import climetlab_eumetnet_postprocessing_benchmark
  print(climetlab_eumetnet_postprocessing_benchmark.__file__)

but they may not be able to do it. Could you check this?
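A slightly more informative variant of this check, as a minimal sketch: it reports which Python interpreter is running and where (if anywhere) the module resolves from, without actually importing it. The helper name `describe_module` is hypothetical; it only uses the standard library, and running it as each affected user should show whether they see the same environment as the admin.

```python
import importlib.util
import sys


def describe_module(name):
    """Report the running interpreter and where a top-level module resolves from."""
    # find_spec returns None when a top-level module cannot be found on sys.path.
    spec = importlib.util.find_spec(name)
    return {
        "executable": sys.executable,
        "found": spec is not None,
        # For an editable install, origin normally points into the source checkout.
        "origin": getattr(spec, "origin", None),
    }


if __name__ == "__main__":
    print(describe_module("climetlab_eumetnet_postprocessing_benchmark"))
```

If `found` is false for the other users but true for the admin, or the `executable` values differ, the problem lies in the environment rather than in climetlab.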

Second, the Python entry points plugin mechanism: running pip install registers the plugin through entry points. You may want to check that everything is fine in this respect. Perhaps uninstalling the plugin and reinstalling it would help, but I am not sure how nb_conda_kernels handles Python plugins that use entry points, whether there is caching, etc.
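One way to inspect the entry points from inside the affected kernel is sketched below. It assumes the plugin registers its datasets under an entry-point group named "climetlab.datasets"; that group name is an assumption here and should be checked against the plugin's setup.py. The helper `list_entry_points` is hypothetical and relies only on `importlib.metadata` from the standard library.

```python
from importlib.metadata import entry_points


def list_entry_points(group):
    """Return the sorted names of installed entry points in the given group."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: entry_points() returns an object with select().
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9: entry_points() returns a dict keyed by group name.
        selected = eps.get(group, [])
    return sorted(ep.name for ep in selected)


if __name__ == "__main__":
    # "climetlab.datasets" is an assumed group name, not confirmed in this thread.
    for name in list_entry_points("climetlab.datasets"):
        print(name)
```

If the Eumetnet dataset names are listed for the admin but not for the other users, the plugin's metadata is probably not visible from their environment.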

jodemaey commented 2 years ago

"Plugin installed correctly" means two things:

First, the plugin needs to be importable. I would expect that you can do

  import climetlab_eumetnet_postprocessing_benchmark
  print(climetlab_eumetnet_postprocessing_benchmark.__file__)

but they may not be able to do it. Could you check this?

No, indeed they can't. I, as the admin who installed the env, can do it. So that is a first problem, maybe because I installed the plugin locally, with the pip install -e command.

I have tried running pip install -e again and even reinstalling the full environment.

jodemaey commented 2 years ago

So it is probably not a climetlab issue; I am therefore closing it.

jodemaey commented 2 years ago

OK, I finally solved the problem. It was a pip conflict, I think; after a lot of reinstalling, it worked. Sorry for this. Thanks.