abby-baskind / seniorthesis


variables MIA #21

Open abby-baskind opened 2 years ago

abby-baskind commented 2 years ago

Hi @jbusecke

I hope you're super swell. Now that I've finished OSM 🥳🎉, I'm moving on to other parts of the analysis that I started a while back and haven't finished. While working on that, I spent a bit looking for the variables I need (e.g. checking Oyr instead of Omon), and I couldn't find some of them. I'll list them below with the specific models I need. Let me know if you know of another way to find them.

siconc (which is usually in SImon rather than Omon)


tauuo, and likely tauvo too. I haven't checked that one yet, but the models without tauuo presumably don't have tauvo either

vo or vmo (surprisingly, for all the gr models I used, and I think I checked whether they had gn versions)

I think those are the only ones I've been missing so far. @gmacgilchrist, if there are any others I missed (I do have all of the heat flux), drop them below :)

Thanks friends!

jbusecke commented 2 years ago

Hi Abby, thanks so much for listing these. Just a word of warning that I am quite busy these days, but we are actively looking into how to maintain the cloud catalog in the future, and we have a somewhat functional script to automatically download the netCDF data and put it into the cloud as Zarr (not in production yet, though). If you read through that issue, you will notice that we base everything around single dataset_ids/instance_ids (an instance_id is just the dataset_id with the version added). In the future we will hopefully accept requests for datasets, but these will basically have to be in the form of a dataset_id (activity_id.institute_id.source_id.experiment_id.variant_label.table_id.variable_id.grid_label.version).
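For reference, here is a minimal sketch of how such a dot-separated ID could be split into its facets. The facet order is taken from the list above. The function name, the handling of an optional leading `CMIP6` prefix, and the example ID (institution and model names are placeholders) are assumptions for illustration only, not part of the actual script.

```python
from typing import Dict

# Facet order as written in the comment above. Treating a leading "CMIP6"
# prefix as optional is an assumption of this sketch.
FACETS = [
    "activity_id", "institute_id", "source_id", "experiment_id",
    "variant_label", "table_id", "variable_id", "grid_label", "version",
]


def parse_instance_id(instance_id: str) -> Dict[str, str]:
    """Split a dot-separated instance_id into a facet -> value mapping."""
    parts = instance_id.strip().split(".")
    # Drop the leading project prefix if it is present.
    if parts and parts[0] == "CMIP6":
        parts = parts[1:]
    if len(parts) != len(FACETS):
        raise ValueError(
            f"expected {len(FACETS)} facets, got {len(parts)}: {instance_id!r}"
        )
    return dict(zip(FACETS, parts))


# Hypothetical ID: institution and model names are placeholders.
example = (
    "CMIP6.CMIP.SOME-INST.SOME-MODEL.historical."
    "r1i1p1f1.SImon.siconc.gn.v20200101"
)
facets = parse_instance_id(example)
print(facets["table_id"], facets["variable_id"], facets["grid_label"])
```

Splitting the ID this way makes it easy to double-check that each entry in a request list has the table (e.g. SImon vs. Omon) and grid (gn vs. gr) that was intended.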

What would be really helpful from your end would be a complete list of these ids. I propose you search for the variables you would like to have using the ESGF CMIP6 search.

You can click and unclick one or several 'facets' (the search items like table_id or experiment_id) and search until you find the datasets you need:

Screen Shot 2022-02-16 at 10 05 09

If you do not find them here, we are out of luck! Then they haven't been added to the archive and you would have to contact the modelling centers directly.

If a dataset exists here and not in the cloud, click on the "show metadata" tab and note the instance_id.

Screen Shot 2022-02-16 at 10 05 46

Once you have a complete list of those instance_ids for all the datasets, I can use them as test cases for our script, which will hopefully get them on the cloud sometime soonish.
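As a rough sketch of the bookkeeping involved, the comparison between what was found on ESGF and what is already in the cloud could look like the following. The helper names and the example IDs are hypothetical, and ignoring the version when comparing is an assumption of this sketch; where the list of cloud IDs comes from is left open.

```python
from typing import Iterable, List


def dataset_key(instance_id: str) -> str:
    """Drop a trailing version facet such as 'v20200101', if present."""
    head, _, tail = instance_id.rpartition(".")
    if head and tail.startswith("v") and tail[1:].isdigit():
        return head
    return instance_id


def missing_from_cloud(wanted: Iterable[str], in_cloud: Iterable[str]) -> List[str]:
    """Return the wanted instance_ids whose dataset is not in the cloud yet."""
    cloud_keys = {dataset_key(i) for i in in_cloud}
    return [i for i in wanted if dataset_key(i) not in cloud_keys]


# Hypothetical IDs: institution and model names are placeholders.
wanted = [
    "CMIP6.CMIP.SOME-INST.SOME-MODEL.historical.r1i1p1f1.SImon.siconc.gn.v20200101",
    "CMIP6.CMIP.SOME-INST.SOME-MODEL.historical.r1i1p1f1.Omon.tauuo.gn.v20200101",
]
in_cloud = [
    "CMIP6.CMIP.SOME-INST.SOME-MODEL.historical.r1i1p1f1.Omon.tauuo.gn.v20190101",
]
todo = missing_from_cloud(wanted, in_cloud)
print(todo)
```

Only the IDs left in `todo` would need to go into the request list.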

Does that work for you?

abby-baskind commented 2 years ago

Hi @jbusecke !

I went through the database and checked for the variables I need that aren't already on Pangeo. Here are the instance IDs:

Thanks for all your help! Let me know if there's anything else I need to do, or if there is a step I missed.

jbusecke commented 2 years ago

These are great. I will not have time this week, but I will try to use these as training datasets for my GitHub Actions (which will have the benefit of getting these online for you 😜). Feel free to ping this thread if nothing happens in the next few weeks.

gmacgilchrist commented 2 years ago

@jbusecke Abby's thesis is actually due in a few weeks. Not the end of the world if we don't have these variables - Abby has plenty of work in her thesis already - but if it were possible to get to some of them next week, that would be valuable.

jbusecke commented 2 years ago

Oh I did not realize the urgency of this. Apologies. I can try to set up something hacky if I have some time on Sunday.

jbusecke commented 2 years ago

Hi @abby-baskind, I just wanted to apologize that I did not get this done in time. And thank you again for using this (still experimental) way of working with CMIP data. I would also be very curious to read your thesis to get an idea of what you have accomplished.

I also wanted to check whether you are still interested in these data, or whether we should close this.