Open gndaskalova opened 6 years ago
- Reproducible environments - conda environments - introduce reproducible stuff at the start of each tutorial, if it's not a tutorial on its own?
- Jupyter notebooks - JupyterLab too, since it is the future
- Physical modelling - numpy 2 (integration methods) @dvalters
- Statistical modelling - intro, intermediate, advanced @dfulu
- Making maps - cartopy Ashley
- Timeseries tutorial - xarray Ashley
- Text analysis - e.g. with tweets @dfulu
- plotly & data visualisation @dfulu
- pandas advanced data vis @dvalters
So for Numerical Modelling with Python I think it would be good to split it into two parts: a 'simpler' lesson that is less mathematical, using Cellular Automaton models (not much maths involved and easy to understand conceptually), and then a second tutorial (Part II) looking at a Finite Difference model in Python/Numpy (slightly more mathsy, but minimal). I would take the examples from my own work with flood models/weather models. So people can get a flavour with the first tutorial and take it further in the second if they like.
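To give a rough idea of the level of maths a Part II finite-difference lesson might involve, here is a minimal sketch of one explicit time step of 1D diffusion. It is only an illustration: the diffusion problem, the function name `diffusion_step` and the parameter `alpha` are assumptions made for this example, not the flood or weather models mentioned above, and it uses plain Python lists rather than NumPy to stay dependency-free.

```python
def diffusion_step(u, alpha):
    """Advance a 1D field by one explicit finite-difference time step.

    u     -- list of values on an evenly spaced grid
    alpha -- D * dt / dx**2 (hypothetical name); this explicit scheme
             is only stable for alpha <= 0.5
    The two boundary values are held fixed.
    """
    new = list(u)
    for i in range(1, len(u) - 1):
        # Central second difference approximates the second spatial derivative.
        new[i] = u[i] + alpha * (u[i - 1] - 2 * u[i] + u[i + 1])
    return new


# A single spike in the middle of the grid spreads to its neighbours.
field = [0.0, 0.0, 1.0, 0.0, 0.0]
field = diffusion_step(field, 0.25)
print(field)
```

A NumPy version would replace the loop with array slicing, which is probably the form a tutorial would teach.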
Proposed dates for finishing the draft tutorials:
- [ ] Numerical Modelling with Python I: Building a flood model (Cellular Automaton model) -- Dec 30th
- [ ] Numerical Modelling with Python II: Building a weather 'forecasting' model (Finite Difference Model) -- Jan 30th '19 ?
If there's time, I'll do the extra Pandas II tutorial as well - I already have some material that I could use/extend for this. I'll update with a time for this later.
Here is my plan for the tutorials I'd like to make. All of them are machine learning type tutorials except for the Plotly / interactive data vis tutorial which is just general data vis skills.
Having made this list I realise how much I have taken on, but I'll be happy to keep working away until it is done. I am also not sure how optimistic I am being on time scales, especially towards the end of this list when I move to MCMC and Gaussian processes. I guess we will have to see.
Some of the items on this list could lead to a second tutorial at some point - definite case to be made for extended funding for coding club ;)
[x] Text analysis and the application of unsupervised machine learning to text. Primarily LDA and NMF, maybe k-means for an easy starting point. A good portion of this should cover cleaning up text for modelling, and also using the Tweepy tool to scrape tweet data.
Again this could be split into 2 tutorials depending on depth required.
First tutorial -- Dec 30
[ ] Unsupervised modelling tutorial. Using unsupervised methods for pattern discovery and exploring your data. Possibly including k-means, PCA, t-SNE and self-organising maps.
This may be multiple tutorials. I think it is too much for one. If it needs to be broken up, the logical break point(s) would be something like ((intro, k-means), PCA), ((t-SNE,), (SOM,)).
First tutorial -- Jan 30
[ ] Supervised ML. Focusing on explainable models (aka no neural networks). Include simple linear/polynomial regression fitting and maybe random forest.
basics probably doable in one tutorial -- Jan 30
[ ] Making interactive plots with Plotly, IPython widgets and making animations
probably one tutorial, but could come back to Plotly and IPython widgets in a follow-on tutorial
First tutorial -- Feb 30
[ ] MCMC fitting and parameter estimation. Calculating the uncertainty in your model parameters.
basics probably doable in one tutorial -- March 30
[ ] Gaussian Processes tutorial. Making the most of a small data source
basics probably doable in one tutorial depending on depth -- Apr 30
PS. Accidentally unassigned this issue. I have no idea how I did it or how to fix it @dvalters , @gndaskalova
I can work on two tutorials (titles TBC). I might also write up some recommendations about use of Jupyter & conda, aiming towards good workflows and reproducible science, but I'm not too sure about the best way to do this yet. If we instruct people to install certain packages etc, we should be consistent about how to do this - sometimes it might be appropriate for people to use isolated conda environments, for example.
[x] Time series analysis with pandas (& xarray) -- Dec 30
[ ] Visualisation and working with map projections with cartopy (& xarray) -- Feb?
My intention is to give instructions for working with pandas and cartopy independently and show xarray for some more advanced applications. I'm trying to identify some good examples to show with real data as I don't have something suitable yet.
I didn't include xarray in the time series tutorial, although I may update it with a small additional example / mention of xarray in the future. As for the cartopy tutorial, I'm not sure when I will have time to create it, so we can leave it to a workshop next semester.
I've been thinking about the numerical modelling workshops, and I think there may be better/more useful topics that can be covered for people based on feedback from some of my colleagues. (It is difficult to cover this kind of modelling in a short 2hr workshop without covering lots of maths...)
I will finish the geopandas one soon, but then I'm proposing the other 2 that I'm writing will be:
As these help with understanding concepts covered in the later tutorials
@dvalters FYI, I started something new on plotting and cartopy, with some reference to OOP: https://github.com/smithara/python_tutorials/blob/master/matplotlib_cartopy_subplots.ipynb
Note which package versions each tutorial uses.
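One possible way to record the versions is a small snippet at the top of each notebook, using only the standard library. This is a sketch, not an agreed convention: the helper name `package_versions` and the package list are assumptions for illustration, and each tutorial would list its own dependencies.

```python
from importlib import metadata


def package_versions(names):
    """Return a dict mapping each package name to its installed version.

    Packages that are not installed map to None instead of raising.
    """
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Hypothetical package list for one tutorial.
for name, version in package_versions(["numpy", "pandas", "xarray", "cartopy"]).items():
    print(f"{name}: {version or 'not installed'}")
```

Printing the versions in the notebook output means readers can compare their own environment against the one the tutorial was written with.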