ecmwf / earthkit-data

A format-agnostic Python interface for geospatial data
Apache License 2.0

How do users get FDB access? #45

Open samsammurphy opened 1 year ago

samsammurphy commented 1 year ago

FDB access is required for fdb_stream.ipynb.

  1. Should we point people to docs on how to get access?
  2. How do I get access? 🙂

sandorkertesz commented 1 year ago

This only works at ECMWF, and you need to build the environment for it. As far as I know, this process is undocumented.

samsammurphy commented 1 year ago

I am not sure we should have a notebook example in an open-source repo that requires an undocumented process to work. Should we consider removing fdb_stream.ipynb?

It could be that this warning is enough, though... [Screenshot 2023-05-03 at 17 48 56]

...so I am not sure how to proceed. Should I close this issue (#45)?

On a related note... should I close this PR?

tlmquintino commented 1 year ago

No, the process is not undocumented, but it is not trivial either. See the FDB and PyFDB repos. Please leave the example here.

tlmquintino commented 1 year ago

Also, this is a work in progress...

sandorkertesz commented 1 year ago

We definitely need this notebook. It is valuable for those who have FDB access at ECMWF. We just need to add a warning to the notebook stating that this example can only be used at ECMWF when pyfdb is set up for you.

tlmquintino commented 1 year ago

That is not correct. Please do not label this example as "only at ECMWF". FDB and pyfdb are open source software that other centres are starting to use. The solution here is to point to the open documentation on Read the Docs for FDB and pyfdb. It is possible that the documentation is insufficient at the moment, but that will be improved in the months to come as we work with the community that is growing around the use of FDB.
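One way a notebook could do this, sketched here rather than taken from the actual notebook, is to check up front whether pyfdb can be found and print a pointer to the docs if it cannot. The helper name `fdb_setup_hint` and the message wording are hypothetical. The check only confirms that the Python package is installed, not that the underlying FDB libraries and configuration work.

```python
import importlib.util

# pyfdb documentation URL as given in this thread.
PYFDB_DOCS = "https://pyfdb.readthedocs.io/en/latest/"


def fdb_setup_hint(module_name="pyfdb"):
    """Return None if the module can be found, otherwise a short setup hint.

    Hypothetical helper: it only tests that the package is importable,
    not that FDB itself is installed and configured.
    """
    if importlib.util.find_spec(module_name) is not None:
        return None
    return (
        f"This example needs '{module_name}' and a working FDB installation. "
        f"See {PYFDB_DOCS} for setup instructions."
    )


# In a notebook this would sit in the first cell.
hint = fdb_setup_hint()
if hint:
    print(hint)
```

A message like this avoids tying the example to one site while still telling readers why the notebook may fail for them.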

samsammurphy commented 1 year ago

I popped in to chat with Emanuele Danovaro. The Read the Docs site for FDB currently consists of this page on the C++ API, with the PyFDB docs on the way...

tlmquintino commented 1 year ago

pyfdb docs:

https://pyfdb.readthedocs.io/en/latest/

Has examples but needs improving.

kinow commented 1 year ago

Hi, I found this issue while searching for pyfdb on GitHub. We are using FDB & pyfdb at the BSC, and are also collaborating with researchers at Unito & UFZ who are also using FDB, GRIB, pyfdb, metkit, etc. These are all different HPC environments, and everything is working so far. So FDB & pyfdb definitely work outside ECMWF 🙂

We have the module system with FDB & dependencies installed for us at these HPCs, but we now need to run some notebooks & pytest using pyfdb. I just finished building a Docker container that will be tested with Singularity on our HPC, for a containerized application that runs in a workflow and accesses FDB with pyfdb. This same container may be used in a GitHub Action to run the pytest tests we have in another Python module (not sure yet if that will be necessary), and as a reference for others who need to install FDB/pyfdb on their local workstations.

Here's the Dockerfile (the test data & schema used are from pyfdb): https://github.com/kinow/docker/tree/b00331bf97c1e59fc07a5b838919faecda0a909a/fdb

To install FDB & dependencies, I looked at the FDB README.md file on GitHub, and found which dependencies I had to install (eccodes, eckit, metkit). Then I installed ecbuild & cmake in that container, and started installing each dependency, one after the other using ecbuild & cmake, while reading their README.md files.
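The build sequence above can be sketched as a small script that works out an install order and prints the CMake commands for each package. This is an illustration, not the contents of the linked Dockerfile: the dependency graph is an assumption based on the packages named here, the `/src` and `/opt/fdb` paths are hypothetical, and a real build may use the `ecbuild` wrapper or need extra options so that CMake can locate ecbuild.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Assumed dependency graph: package -> packages that must be installed first.
DEPENDS_ON = {
    "eccodes": set(),
    "eckit": set(),
    "metkit": {"eccodes", "eckit"},
    "fdb": {"eccodes", "eckit", "metkit"},
}


def build_order(graph=DEPENDS_ON):
    """Return the packages in an order that installs dependencies first."""
    return list(TopologicalSorter(graph).static_order())


def cmake_commands(package, src_root="/src", prefix="/opt/fdb"):
    """Return configure, build and install commands for one package.

    The source and install locations are hypothetical placeholders.
    """
    src = f"{src_root}/{package}"
    build = f"{src}/build"
    return [
        f"cmake -S {src} -B {build} "
        f"-DCMAKE_INSTALL_PREFIX={prefix} -DCMAKE_PREFIX_PATH={prefix}",
        f"cmake --build {build}",
        f"cmake --install {build}",
    ]


# Print the commands instead of running them, so they can be reviewed first.
for package in build_order():
    for command in cmake_commands(package):
        print(command)
```

Installing everything under one prefix and passing it as `CMAKE_PREFIX_PATH` is one way to let each later package find the earlier ones.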

I had one issue that was my fault for trying to mix these cmake-installed dependencies with a Conda package (eccodes). But installing eccodes with cmake solved that. Then there were a few system runtime dependencies missing for the container, and I slowly added them one at a time to fix the errors, using this DWD python-eccodes container as a reference: https://github.com/DeutscherWetterdienst/python-eccodes/blob/983a0bc6dcf9835fb1fd245ddc1c9486f8e0117d/Dockerfile

That Dockerfile contains the minimum I needed to get the fdb* commands to work, and also to run some basic tests with pyfdb and another Python module installed with micromamba.
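A quick smoke check for a container like this could confirm that the fdb* commands are on the `PATH` before anything else is tried. The sketch below is not part of the Dockerfile; the list of tool names is an assumption and should be adjusted to whatever the FDB install actually provides.

```python
import shutil

# Assumed tool names; adjust to match the FDB installation.
FDB_TOOLS = ["fdb-info", "fdb-list", "fdb-write", "fdb-read"]


def missing_tools(tools=FDB_TOOLS):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("Not found on PATH: " + ", ".join(missing))
else:
    print("All FDB tools found.")
```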

Just in case that helps others trying to set up FDB & pyfdb 👍