ScottWales / swc-climatedata

Lessons for working with climate data
https://scottwales.github.io/swc-climatedata

Comments #1

Open DamienIrving opened 7 years ago

DamienIrving commented 7 years ago

@ScottWales - Well done on putting all this together! I especially like the content on THREDDS - I bet most of the audience will be unaware that you can analyse data without having the original data file sitting on your machine. Apologies for the delay in getting back to you with some comments (all of which are just minor things):

Software installation

An alternative to

wget https://raw.githubusercontent.com/ScottWales/swc-climatedata/gh-pages/conda-env.yml
conda env create -f conda-env.yml

would be to upload conda-env.yml to your account at anaconda.org. People could then just type (assuming your anaconda user name is scottwales):

conda env create scottwales/conda-env

I talk about how to get environments up onto your anaconda.org page in this quick conda tutorial.

On the topic of conda environments, I think conda kapsel will be a bit of a game changer for reproducible research. It's kind of like using an environment.yml file except you just list your dependencies in your README file and kapsel comes along, parses the README and installs it all for you.

Profiling

In the lesson on optimising, it might be worth briefly touching on how to profile code to figure out where the optimisation needs to happen. I asked around the Software Carpentry community a while ago to find out what the simplest tools are for time and memory profiling, then wrote it up into a very short lesson.
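As a rough illustration of the kind of time and memory profiling meant here, the sketch below uses only Python's standard library (`cProfile`, `pstats` and `tracemalloc`). These are not necessarily the tools covered in the short lesson mentioned above, and `slow_sum_of_squares`, `profile_time` and `profile_memory` are hypothetical names made up for this example.

```python
import cProfile
import io
import pstats
import tracemalloc


def slow_sum_of_squares(n):
    """Hypothetical stand-in for an analysis step worth profiling."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_time(func, *args):
    """Run func under cProfile; return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Show the five entries with the largest cumulative time.
    stats.sort_stats("cumulative").print_stats(5)
    return result, buffer.getvalue()


def profile_memory(func, *args):
    """Run func under tracemalloc; return (result, peak traced bytes)."""
    tracemalloc.start()
    try:
        result = func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


if __name__ == "__main__":
    _, report = profile_time(slow_sum_of_squares, 200_000)
    print(report)
    _, peak = profile_memory(list, range(100_000))
    print(f"peak traced memory: {peak} bytes")
```

The time report shows which functions dominate the run time, which is usually enough to decide where optimisation effort should go. Note that `tracemalloc` only sees allocations made through Python's memory allocator, so memory used inside some compiled libraries may not be counted.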

xarray or iris?

This post might be a good place to point people if they're confused about whether they should use iris or xarray (or something else).

ScottWales commented 7 years ago

Thanks Damien,

We ran the session yesterday afternoon. The OPeNDAP and xarray content went well; calculating the NINO34 index from CMIP data loaded from THREDDS was a good demo. It's probably worthwhile for me to flesh out the notes for that section so that it can be run again in the future.
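For reference, the arithmetic behind a NINO34-style index can be sketched in plain Python, without the remote data or xarray used in the session. This is not the workshop code: the box bounds are the conventional NINO3.4 definition (5S-5N, 170W-120W) rather than something stated in the lesson, and `box_mean`, `nino34_index` and the tiny grid in the demo are made up for illustration.

```python
import math

# Conventional NINO3.4 box: 5S-5N, 170W-120W (190E-240E on a 0-360 grid).
# Assumption: these bounds are the usual definition, not taken from the lesson.
LAT_BOUNDS = (-5.0, 5.0)
LON_BOUNDS = (190.0, 240.0)


def box_mean(field, lats, lons, lat_bounds=LAT_BOUNDS, lon_bounds=LON_BOUNDS):
    """Cosine-latitude-weighted mean of field[lat][lon] inside the box."""
    weighted_sum = 0.0
    weight_total = 0.0
    for i, lat in enumerate(lats):
        if not lat_bounds[0] <= lat <= lat_bounds[1]:
            continue
        # Grid cells shrink towards the poles, so weight rows by cos(latitude).
        weight = math.cos(math.radians(lat))
        for j, lon in enumerate(lons):
            # "% 360" lets -180..180 longitudes work as well as 0..360 ones.
            if lon_bounds[0] <= lon % 360.0 <= lon_bounds[1]:
                weighted_sum += weight * field[i][j]
                weight_total += weight
    if weight_total == 0.0:
        raise ValueError("no grid points fall inside the box")
    return weighted_sum / weight_total


def nino34_index(sst_series, climatology, lats, lons):
    """Box-mean SST anomaly for each time step.

    Simplification: a single climatology field is used for every time step;
    a real calculation would normally use one per calendar month.
    """
    baseline = box_mean(climatology, lats, lons)
    return [box_mean(sst, lats, lons) - baseline for sst in sst_series]


if __name__ == "__main__":
    # Synthetic 4x4 grid, purely for demonstration.
    lats = [-7.5, -2.5, 2.5, 7.5]
    lons = [180.0, 200.0, 220.0, 260.0]
    climatology = [[27.0] * len(lons) for _ in lats]
    warm_month = [[28.5] * len(lons) for _ in lats]
    print(nino34_index([climatology, warm_month], climatology, lats, lons))
```

With xarray the same steps would typically be a region selection followed by a weighted mean on a dataset opened from the THREDDS OPeNDAP URL, so only the selected region needs to be transferred.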

The students were less attentive to the metadata & publishing section - I'm not as familiar with that content myself, and it was less interactive.

I was hoping to use NCI's virtual desktops for the Iris and Dask content, but the required software wasn't on the lab computers in time. I ended up not covering Iris and instead took a brief look at using Dask to chunk xarray datasets in a terminal session.

Thanks for the conda pointer, much easier than typing out a long URL. A few people weren't able to install the environment - I think one of the dependencies isn't available for Windows. Most students were using a JupyterHub instance I set up as a contingency on the Nectar cloud, which had all the libraries pre-installed. It was nice to have this as a backup when people had trouble installing the environment, and JupyterHub itself was remarkably simple to set up on Nectar.

DamienIrving commented 7 years ago

Whoops, sorry! I thought the session was going to be at the Lorne workshop next week...

Glad to hear it went well.