pangeo-data / education-material

An organizational meta-repo with pointers to all of the myriad educational materials available today (in any form)
MIT License

Gateway to Pangeo: Modules for Tutorials #15

Open amanda-tan opened 4 years ago

amanda-tan commented 4 years ago

I have been thinking about how best to create an Introduction to Pangeo "course", as we have been getting lots of questions from users who are unsure about even the fundamentals of Docker containers and JupyterHub. One needs to understand these to some degree in order to get packages working and to understand what can and cannot realistically be achieved with the Pangeo platform.

I would like to solicit feedback on how best to market Pangeo to beginners, how to introduce users to the fundamental components that make up Pangeo, and how to help them envision a workflow that they can port to their own work. There are many things about using Pangeo that might be lost on an entirely new user (such as installing their own conda packages). I feel that it might be best to design such a "course" in more of a niche setting - for example, if we want to design the course in the context of working with NASA datasets and AWS, the modules might look something like this:

Would something like that be helpful and encourage more uptake of and interest in Pangeo? And is there a need to develop new notebooks/materials, or are there existing notebooks/materials that can be recycled in this context?

rabernat commented 4 years ago

Amanda, this is a great idea! A potential user basically just tweeted a plea for this: https://twitter.com/MorganEONeill/status/1225613102483304448

I need a roadmap and/or nested hierarchy of these things and/or literally just a link to the right tutorial/ppt/walk-through of all these different nifty resources...

It reminds me a bit of these great "developer roadmaps", e.g. https://roadmap.sh/frontend

[Image: frontend roadmap]

Maybe we should try to create such a diagram for Pangeo.

jbednar commented 4 years ago

We made a stab at something sort of like that for the HoloViz tools:

https://pyviz-dev.github.io/holoviz/background.html#the-holoviz-ecosystem (src: https://github.com/holoviz/holoviz/blob/master/doc/flowcharts/holoviz.mermaid.txt)

guillaumeeb commented 4 years ago

Making such a roadmap is a great idea. It would help to identify the courses/modules needed to get started with Pangeo. And I'm sure a big part of these introductory modules already exists somewhere!

robfatland commented 4 years ago

I think this (ATan's) is a fantastic idea as well; and I think now that OSM is done we'll discuss this further at eScience this week; so please send more input and / or request a call-in if you are so inclined. I think Amanda's plan and the related comments here sharpen an idea that came up with Scott of having a (geographically distributed) hack day in March where Pangeo educators ("POETs") who can participate take time out to dig into their favorite Pangeo education topic.

robfatland commented 4 years ago

I propose a 3-week sprint at 10 am PDT on March 17, 24, and 31 to follow up. Details to follow; I'm just using this, a Slack remark, and a Discourse remark to 'mark the date'. We'll probably use Zoom, as UW recently got a global license and it handles large groups gracefully.

cgentemann commented 4 years ago

Sounds great!

ktyle commented 4 years ago

I wanted to take this opportunity to place here an outline of the upcoming course that leverages the Pangeo ecosystem that I’ll be debuting at UAlbany come Fall 2020.

The class, provisionally entitled "Advanced Meteorological Data Analysis and Visualization" will be open to senior-year undergraduates, as well as all graduate students, and its content will be delivered via an "online" course. In the 13 or so weeks I will have available, I currently have about 15 units (will cut back to 13 as I continue to prepare the course) that include the following:

  1. Reproducibility / Binder / Docker / Singularity
  2. Git
  3. Cloud computing
  4. Core Python Language: Python Fundamentals, Functions & Classes, Organization & Packaging (consider having this as a pre-req / review document)
  5. Scientific Python Fundamentals: NumPy and Matplotlib
  6. GIS / Cartopy
  7. GIS / QGIS / Rasterio / Folium (check out Cloud-optimized GeoTIFF: https://www.cogeo.org/)
  8. Databases
  9. Pandas
  10. MetPy / Siphon
  11. Xarray
  12. Xarray / Zarr (datasets: ERA5, CFSR, CMIP6)
  13. Interactive visualizations: ipywidgets / Bokeh / GeoViews
  14. Interactive visualizations: ipyleaflet / hvPlot
  15. Dask

I plan to construct the course content in a manner consistent with a Carpentries-developed course, with Jupyter notebooks presenting both the software and the documentation. The content will be inspired by, and will aim to build on and extend, these three resources:

  1. Ryan Abernathey’s Research Computing class at Columbia: https://earth-env-data-science.github.io/intro
  2. Brian Rose’s Climate physics / modeling class at UAlbany: https://brian-rose.github.io/ClimateLaboratoryBook
  3. Damien Irving’s Data Carpentry lesson for AOS: https://carpentrieslab.github.io/python-aos-lesson/

I envision that this class would be a logical outgrowth/extension of Damien’s lessons, but for a semester-long class rather than a one-day Carpentries session. Ample time for a really deep and wide dive!

Ideally, many of these notebooks that will be developed as part of the class could become part of Pangeo's gallery of content/notebooks. My plan is also to obtain an education grant from one of the commercial cloud providers such that much of the work will be done in the cloud.

Students would be expected to contribute to FOSS developed for this course (as well as allied with it) and will also have a final project that demonstrates reproducibility ... (following Irving, 2016: https://doi.org/10.1175/BAMS-D-15-00010.1 ).

Ultimately, this class, given its online nature, could be taken by anyone, regardless of UAlbany matriculation status, and the intent is to make its core content available to anyone via The Carpentries.

Feedback is welcome ... Github link will be forthcoming over the next couple of months.

kmpaul commented 4 years ago

Having given a number of tutorials to NCAR staff in the last year or two, one thing we've definitely noticed is that there are many attendees who don't even have a good footing in Python. We've (@jukent, @andersy005, et al.) been taking it on ourselves to build a "base" (or Python newbie) layer into our tutorial material to get people up to speed on Python before they jump into even the most basic Python tools. Our staff usually can be trusted to know "a programming language" and even be familiar with "scientific programming" in that language, but learning and building comfort with Python seems challenging for some of them.

Has any consideration been given to what the "introduction for people who don't know Python" should look like? I'm aware of a number of different "Python for Programmers" materials (some in physical book form), but after quite a bit of experience giving tutorials to existing, experienced scientists who "just" don't know Python, I'm not sure those materials are the best. Perhaps you all know of something better?