Closed Ovewh closed 5 months ago
Thanks for doing this. Updated with the other data description and removed the old file. Do you think we should add an overview to index.rst for the data section?
This is awesome! I wonder though if it could be simplified? Like, do they need to pre-process? And do they need the dask?
Secondly, I'm getting a lot of errors when I use this catalog that I didn't get with the pangeo catalog. Any ideas why? Right now it's giving me a lot of
but I also e.g. got this:
I can obviously upload the example if helpful.
Are you opening the whole catalog at once without .search()?
Like, do they need to pre-process?
I think it is nice to see that there is an option for preprocessing the data. That can save you some work that would otherwise have to be done with loops etc.
And do they need the dask?
Dask is also optional, but can make the calculation faster.
I agree it's very nice, but most of the students might want to just copy-paste that code into their own notebook and tweak the search. That's why I would do a super easy "read and plot" first and then add complexity after. Does that make sense?
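To make the "read and plot" idea concrete, here is a minimal sketch of what such a first cell could look like. It assumes `intake`, `intake-esm`, `xarray` and `matplotlib` are installed; the function name `read_and_plot`, the catalog path and the query values in the example call are placeholders, not the ones used in the notebook.

```python
def read_and_plot(catalog_path, variable_id, **query):
    """Open an intake-esm catalog, search it, load one dataset, plot the first time step."""
    # Imported inside the function so it can be defined even where
    # intake-esm is not installed (the intake-esm plugin is required to run it).
    import intake

    cat = intake.open_esm_datastore(catalog_path)

    # Narrow the catalog first instead of opening everything at once.
    subset = cat.search(variable_id=variable_id, **query)

    # No preprocess function is passed and no dask cluster is started:
    # both are optional extras that can be added later.
    dsets = subset.to_dataset_dict()

    # Take the first dataset that came back and plot its first time step.
    # This works best when the search narrows down to a single member.
    key, ds = next(iter(dsets.items()))
    ds[variable_id].isel(time=0).squeeze().plot()
    return key, ds


# Example call (hypothetical path and query values):
# key, ds = read_and_plot("CMIP.json", "tas", experiment_id="historical",
#                         table_id="Amon", source_id="NorESM2-LM")
```

Preprocessing and dask could then be introduced in a later cell, once the basic search and plot work.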
Are you opening the whole catalog at once without .search()?
@mvdebolskiy BTW could you make a separate folder for catalogs? Under /mnt/craas1-ns9989k-geo4992/data/catalogs
Could you also copy over some other catalogs? I have built one for the CESM-PPE and also one which merges pangeo and the local CMIP6 catalog.
@Ovewh Sure, I can make one. Can you put all of them in your $HOME/catalogs and ping me?
Ok, so my error came because it is trying to merge from AERmon and Amon (or possibly because of the difference in vertical coordinate).
We tested a bit with @mvdebolskiy and if we specify:
cat.esmcat.aggregation_control.groupby_attrs = ['activity_id', 'experiment_id','source_id','table_id']
it works fine. It might need more separators as well, I am not sure. Also, the example you give, Ove, seems to separate by only activity_id and institution_id, which is maybe not ideal (e.g. in case you open different experiments in one).
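To illustrate why the choice of groupby attributes matters, here is a small pure-Python sketch. It is not intake-esm's actual implementation: the rows are made-up catalog entries, `group_keys` is a hypothetical helper, and joining the key with "." is an assumption about the default separator.

```python
from collections import defaultdict

# Made-up catalog rows; a real catalog has one row per file or asset.
ROWS = [
    {"activity_id": "CMIP", "institution_id": "NCC", "source_id": "NorESM2-LM",
     "experiment_id": "historical", "table_id": "Amon", "variable_id": "tas"},
    {"activity_id": "CMIP", "institution_id": "NCC", "source_id": "NorESM2-LM",
     "experiment_id": "historical", "table_id": "AERmon", "variable_id": "mmrso4"},
    {"activity_id": "CMIP", "institution_id": "NCC", "source_id": "NorESM2-LM",
     "experiment_id": "ssp585", "table_id": "Amon", "variable_id": "tas"},
]


def group_keys(rows, groupby_attrs, sep="."):
    """Group rows by the given attributes; each group would become one merged dataset."""
    groups = defaultdict(list)
    for row in rows:
        key = sep.join(row[attr] for attr in groupby_attrs)
        groups[key].append(f'{row["table_id"]}/{row["variable_id"]}')
    return dict(groups)


# Grouping on too few attributes lumps AERmon and Amon (and both experiments) together.
coarse = group_keys(ROWS, ["activity_id", "institution_id"])

# Adding experiment_id, source_id and table_id keeps them in separate groups.
fine = group_keys(ROWS, ["activity_id", "experiment_id", "source_id", "table_id"])

for name, groups in (("coarse", coarse), ("fine", fine)):
    print(name)
    for key, members in groups.items():
        print("  ", key, "->", members)
```

With the coarse grouping all three rows land under a single key, so the tables would have to be merged into one dataset. With the finer grouping each table and experiment gets its own key.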
@sarambl Ok, I'll add the same groupby attrs as pangeo uses
@Ovewh Sure, I can make one. Can you put all of them in your $HOME/catalogs and ping me?
I have put all the catalogs under fc-3auid-3a9fdc0c87-2d7836-2d4bdc-2db802-2d9a250c322e3b/catalogs @mvdebolskiy
@Ovewh, awesome. Will put them in data. Btw, change the dask tooltip, since it's easier to just click on the left sidebar in JupyterLab.
I think it looks good for now, so I'll merge it into main
Added a notebook example on how to use intake-esm and an intake catalog to browse the available data.
I used the one called CMIP.json, which only contains local CMIP6 data.