csdms / bmi-wavewatch3

Fetch WaveWatch3 data
MIT License

Errors on Windows: eccodes doesn't seem to work #17

Open leonardolombardi07 opened 1 year ago

leonardolombardi07 commented 1 year ago

Hello! First of all thanks a lot for open-sourcing this library - the source code is very well written and the documentation looks great. Thanks.

I'm in charge of facilitating access to WW3 data for my team, which is exactly what this library does. However, I'm getting errors when I try to use it on Windows (eccodes doesn't seem to work on Windows).

Do you have any tips on how this could be resolved? Maybe use some library other than cfgrib? I'm not very experienced with Python, so maybe it's possible to solve the problem with virtual environments or something?

Any advice would be helpful. Thank you very much for your attention.

mcflugen commented 1 year ago

@leonardolombardi07 Thank you for your interest in bmi-wavewatch3! I'm happy to hear it may be of use to you and your team.

Could you please pass along a minimal code snippet that reproduces the problem you are having?

leonardolombardi07 commented 1 year ago

Thanks for the answer. Below is a simple code snippet that I ran both in Visual Studio Code and in a Jupyter Notebook.

from bmi_wavewatch3 import WaveWatch3
import xarray as xr

ww3 = WaveWatch3("2009-11-08")
print(ww3.data)
xr.open_mfdataset(
    [ww3._cache / url.filename for url in ww3._urls],
    engine="cfgrib",
    parallel=True,
)

If I don't comment out the "print(ww3.data)" line, I get the error:

RuntimeError: 
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.

And if I simply try to open the data with xarray, I get the following error:

ECCODES ERROR   :  grib_handle_create: cannot create handle, no definitions found
ecCodes assertion failed: `h' in D:\a\ecmwflibs\ecmwflibs\src\eccodes\src\grib_query.c:572

It seems that the fetch (ww3.fetch data()) and the cache are working fine; the problem is really in converting the data from GRIB2 to an xarray dataset. I tried running with "parallel=False", installing different versions of cfgrib, and a few other things. But, as I said in my first message, I believe the problem is that the cfgrib package uses eccodes, which does not seem to be compatible with Windows. I've looked around, and it seems there's no package capable of handling GRIB files on Windows.
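A minimal sketch for narrowing this kind of problem down, assuming only that cfgrib and eccodes are the two Python packages involved (the helper name check_grib_stack is hypothetical, not part of any library). It reports whether each package is installed and whether importing it succeeds. Note that it only catches missing packages and import-time failures; the "no definitions found" error above is raised later, when a GRIB message is actually decoded, so a clean report here does not prove the install is healthy.

```python
import importlib
import importlib.util


def check_grib_stack(modules=("cfgrib", "eccodes")):
    """Report, per module, whether it is installed and whether importing it works.

    Hypothetical helper for illustration only.
    """
    report = {}
    for name in modules:
        # find_spec returns None when a top-level module cannot be found.
        if importlib.util.find_spec(name) is None:
            report[name] = "not installed"
            continue
        try:
            importlib.import_module(name)
        except Exception as exc:  # e.g. a shared library that fails to load
            report[name] = f"import failed: {exc}"
        else:
            report[name] = "ok"
    return report


print(check_grib_stack())
```

cfgrib also ships its own check, run as "python -m cfgrib selfcheck", which exercises the ecCodes bindings more directly than an import test does.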

mcflugen commented 1 year ago

@leonardolombardi07 I've had some time to look at your issue and the problem is, as you say, with eccodes on Windows. I'm working on this in #18 but thought I would give you an update before I merge it in case you want to get started on this.

I assume that you have installed bmi-wavewatch3 from PyPI using pip, is that correct? It appears that the issue is only with the eccodes that's offered through PyPI so, if you are able, I suggest installing bmi-wavewatch3 through conda. It appears that the version of eccodes on conda-forge works with Windows.

Have you worked with the Anaconda Python distribution before? We tend to prefer it as it has a lot of packages for scientific programming that aren't available on PyPI (which, I think, is the reason we hadn't encountered your problem).

leonardolombardi07 commented 1 year ago

This is awesome! I was indeed using pip. I tried using Anaconda, but ran into some unexpected problems downloading packages from conda-forge, so I gave up real quick. I'll try again.

Thank you very very much!

mcflugen commented 1 year ago

Yes, please give it a try again and report back. Make sure you install it into a new environment.

I'm not certain but I think the error you were getting from the following,

ww3 = WaveWatch3("2009-11-08")
print(ww3.data)
xr.open_mfdataset(
    [ww3._cache / url.filename for url in ww3._urls],
    engine="cfgrib",
    parallel=True,
)

was because ww3.data opens the dataset using xr.open_mfdataset and so, when you try to open it a second time, you get the RuntimeError.
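Separately, the RuntimeError text itself mentions "not using fork", which refers to how Python's multiprocessing module starts child processes. On Windows the only available start method is "spawn", which launches a fresh interpreter that imports the main module again, and that is why the message recommends the if __name__ == '__main__' idiom. Whether this is what actually triggers the error in this thread is an assumption. The sketch below only shows how to check which start method a platform uses by default; both function names are hypothetical.

```python
import multiprocessing as mp


def default_start_method():
    """Return the start method multiprocessing uses by default on this platform."""
    # The first entry of get_all_start_methods() is documented as the default.
    return mp.get_all_start_methods()[0]


def needs_main_guard(method):
    """Return True when child processes import the main module again.

    Only "fork" copies the running parent process; "spawn" and "forkserver"
    import the main module in the child, so unguarded top-level code that
    starts processes would run again there.
    """
    return method != "fork"


method = default_start_method()
print(method, needs_main_guard(method))
```

If the start method is anything other than "fork", code that starts processes when run as a script is safest inside a function called from under if __name__ == '__main__'.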

I, unfortunately, don't have access to a Windows machine right now so it's difficult for me to debug.

leonardolombardi07 commented 1 year ago

Hi @mcflugen! Just want you to know that after some config I was able to make it work on Windows. Thank you very much.

PS: coming from a web development background, I'm very surprised at how difficult it is to deal with infrastructure and environments in Python. I need to learn more!