FESOM / fdiag

FESOM2 monitoring

Openifs radiation imbalance #11

Closed JanStreffing closed 2 years ago

JanStreffing commented 2 years ago

First OpenIFS plot routine. I had some trouble putting it in a subfolder. That's probably something that would take me 15 minutes and @koldunovn 15 s. Maybe you can rectify this. I found that for loading and making field means, the fastest option is to use the cdo pipe-into-numpy-array feature. This adds a cdo dependency. I can load 9.7 GB of data and calculate the fldmean of the 880 years of monthly surface data in under 20 s per field -> total runtime 120 s.
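For context, cdo's fldmean is an area-weighted spatial mean. A minimal numpy sketch of what that computes on a hypothetical regular lat-lon grid (the grid and variable shapes are illustrative, not the actual workflow data):

```python
import numpy as np

def fldmean(field, lat):
    """Area-weighted spatial mean over (lat, lon) of a (time, lat, lon) array.

    Uses cos(latitude) weights, which is what an area-weighted mean
    reduces to on a regular lat-lon grid.
    """
    weights = np.cos(np.deg2rad(lat))            # shape (nlat,)
    weights = weights / weights.sum()            # normalise to sum to 1
    zonal_mean = field.mean(axis=-1)             # average over lon -> (time, nlat)
    return (zonal_mean * weights).sum(axis=-1)   # weight by lat -> (time,)

# Toy check: a constant field must have a mean equal to that constant.
lat = np.linspace(-89.5, 89.5, 180)
field = np.full((12, 180, 360), 3.0)             # 12 months of constant data
print(fldmean(field, lat))                       # 12 values, all 3.0
```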

The solution I wrote based on xarray's open_mfdataset, loading the entire dataset into Python, takes ~215 s to load the data -> total runtime ~21 min.

I think the extra dependency is justified by the speedup.

koldunovn commented 2 years ago

It's not like I am totally against new dependencies, but in my experience cdo never beats xarray :) I tried to compute a field mean with the data you have in the workflow, and it took me about 7 seconds per field, plus some time for opening the data (which takes 15 s the first time, but is then cached somehow). Please have a look here: /p/home/jusers/koldunov1/juwels/PYTHON/JAN/test_xarray.ipynb. If you are working with some different data, let me know and I will try to test.

In general, extra dependencies in the notebooks are currently not a problem, but we will have to revisit them at some point, when the package is more mature and covered with at least some tests :)
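For comparison, the pure-xarray field mean is a one-liner once the dataset is open. A minimal sketch on a synthetic in-memory dataset (the variable name sst and the grid are made up for illustration; the real workflow would open files lazily with xr.open_mfdataset instead):

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the real files; in the workflow this would be
# something like ds = xr.open_mfdataset("exp_*/output_*.nc")  (lazy load).
ds = xr.Dataset(
    {"sst": (("time", "lat", "lon"), np.full((12, 90, 180), 2.0))},
    coords={
        "time": np.arange(12),
        "lat": np.linspace(-89, 89, 90),
        "lon": np.linspace(0, 358, 180),
    },
)

# Unweighted field mean per time step (cdo's fldmean would additionally
# apply cos(lat) area weights).
fldmean = ds["sst"].mean(dim=("lat", "lon"))
print(fldmean.values[:3])   # constant field -> all 2.0
```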

JanStreffing commented 2 years ago

The security settings of Juwels mean that we cannot access each other's home directories. Could you share the example script via your sub-folder inside /p/project/chhb19?

In the example I'm only loading 40 years. I can do that in < 1 s in both the cdo and xarray versions. However, when I load exps = range(1, 45) it's a totally different story. xarray seems to get exponentially slower the more data I read in, while the time used by the cdo-based loading grows linearly.
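One common cause of super-linear slowdown when reading many experiments (a guess, not confirmed from the notebook) is growing an array inside the loop, which re-copies all previously loaded data on every iteration. A sketch of the usual fix, collecting per-experiment chunks and concatenating once at the end (load_experiment is a hypothetical stand-in for the actual reader):

```python
import numpy as np

def load_experiment(exp):
    # Hypothetical stand-in for reading one experiment's 12 monthly means.
    return np.full((12,), float(exp))

# Slow pattern: calling np.concatenate inside the loop copies everything
# already loaded on each iteration, so total cost grows quadratically
# with the number of experiments.
# Fast pattern: collect chunks first, concatenate once -> a single copy.
chunks = [load_experiment(exp) for exp in range(1, 45)]
series = np.concatenate(chunks)
print(series.shape)   # (528,) -> 44 experiments x 12 months
```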