UMCarpentries / Administration

Discussion of issues related to the running of the UM Software Carpentry group

How to cater to specific audiences? A case of SNRE and EEB #22

Open marschmi opened 8 years ago

marschmi commented 8 years ago

Hi Everyone! @arthur-e and I are hoping to organize a SWC or DC event over October break for SNRE and EEB. However, how do we figure out how to cater to the audiences of our workshops?

@arthur-e was very proactive and sent a survey to the SNRE PhDs and Post Docs and came up with the following results:

From these results, @arthur-e provided 2 ideas:

  1. Teach the standard R-flavored Data Carpentry Ecology workshop, setting the expectation that participants should already have basic familiarity with R. The 4 modules included would be OpenRefine, Data Analysis in R (function application, dplyr), Visualization in R (ggplot2), and Intro to SQL.
  2. Teach a standard Software Carpentry workshop that is Python-flavored (anticipating that, like at SNRE, few EEB students are familiar with Python). This would be Unix Shell + Intro to SQL + Intro to Python + Version Control with Git/Mercurial.

For EEB-specific ideas: do @duffymeg, @michberr, @bsmith89, @singhal, or @clarashaw have ideas? Would EEBers be interested in SQL? It might be best to have a second lesson in Python (SWC) or to go deeper into R (DC). In addition, as the DC OpenRefine lesson is only ~90 minutes and the R lesson is geared towards novice R users, we might want to add a module on vegan (e.g. working with packages in R) or on how to run models in R. But this puts more strain on instructors.

I'd love to hear ideas!

bsmith89 commented 8 years ago

While I would love to have more EEB/SNRE folks working in Python, I think an advanced R workshop will be more immediately useful, since learners are already using it in their work. I'd particularly like to see sections on modularizing code (functions and reusable scripts over monolithic analyses), "tidyverse" packages, Rmd, etc. Given extant resources, like the R bootcamp that Rabosky lab members put on, I think a Novice R lesson would be somewhat redundant.

The challenge, in that case, is finding/creating teaching material which won't lose the 10-30% of learners who are still getting the basics, while still providing useful information for the fraction that are already proficient. As you pointed out, this certainly puts more strain on instructors. We could and probably should limit an Intermediate R lesson to one half-day, rather than the traditional full day.

I think SQL would be a really fantastic addition for EEB/SNRE people. I've started using it extensively in my work, and I'm really happy about that. I think I would be able to teach that lesson, although I have not yet.

If, for some reason, the organizers decide they want Novice Python for this workshop, I'd be happy to teach it.

bsmith89 commented 8 years ago

If I'm teaching Novice Python I would want a full day, so it would have to be Python, Git, and Shell, dropping SQL.

arthur-e commented 8 years ago

Thanks for your comments, Byron (@bsmith89). How about this for an "advanced" R Intro? This is basically what I taught at the Federal Reserve Board last week. It starts out with a review of the basics, which I feel is in the spirit of SWC/DC and recognizes that a minority of attendees will be new to R. However, it culminates in advanced analysis with plyr, dplyr, and tidyr. The Gapminder dataset could be easily switched out for the Data Carpentry "Ecology" dataset.

Skeleton workshop:
Day 1: Advanced Intro to R for Data Analysis, ???
Day 2: (Data Carpentry) Introduction to SQL, ???

pschloss commented 8 years ago

If 88% of people claim knowledge of R but want more information on programming in R, I would interpret that as meaning that they're good at chaining together functions to perhaps run a statistical test or generate a plot. Perhaps something that emphasizes functions, variables, DRY, and testing would be useful to that group.
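
As a rough illustration of the "functions, DRY, testing" idea (an editorial sketch, not material from the thread, and in Python rather than R): the repeated calculation lives in one function, and a small test checks it against a case that can be worked out by hand. The function name `standardize` and the sample numbers are hypothetical.

```python
def standardize(values):
    """Center values on their mean and scale by the sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / (n - 1)) ** 0.5
    return [(v - mean) / sd for v in values]


def test_standardize():
    # Hand-checkable case: mean is 4 and sample sd is 2, so the result is -1, 0, 1.
    result = standardize([2.0, 4.0, 6.0])
    assert abs(result[0] + 1.0) < 1e-9
    assert abs(result[1]) < 1e-9
    assert abs(result[2] - 1.0) < 1e-9


# Hypothetical data: the same function is reused instead of copy-pasting the formula.
heights = standardize([150.0, 160.0, 170.0])
weights = standardize([50.0, 65.0, 80.0])

test_standardize()
```

The point is only that a change to the calculation happens in one place, and the test catches a mistake before it reaches an analysis.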

bsmith89 commented 8 years ago

@arthur-e That material looks really fantastic! Did you teach it within a single half-day session?

michberr commented 8 years ago

To add my two cents:

Dplyr: I'm not sure what fraction of the 88% of people who already know R are familiar with dplyr. That said, it's a very intuitive syntax, so you could review the basic verbs like select, filter, group_by, and summarize in ~45 minutes. I think this group would really benefit from sections on the join verbs (because we're always joining different datasets together). I also think the tidyr verbs like spread and gather are quite useful and prime the audience for SQL syntax later.

Lists: In addition to functions, lists really help make your code more modular. I think lots of more novice R users don't have a good grasp of how and when to use a list as well as the list functions like lapply, sapply, and Map (basically the only 3 I use). These can really transform your code when you're trying to do exploratory analysis on a bunch of variables or when you're trying to generate 6 plots that are nearly identical.
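
The pattern described above, applying one function across a list of variables instead of copying the analysis several times, can be sketched as follows. This is an editorial illustration in Python, not R: the dict comprehension is only a loose analogue of `lapply`, and the variable names and measurements are made up.

```python
from statistics import mean, stdev

# Hypothetical measurements: one list of values per variable.
measurements = {
    "temperature": [12.1, 13.4, 11.8, 12.9],
    "ph": [7.1, 6.9, 7.3, 7.0],
    "depth_m": [1.0, 2.5, 4.0, 5.5],
}


def summarize(values):
    """Return the same small summary for any numeric variable."""
    return {"n": len(values), "mean": mean(values), "sd": stdev(values)}


# Apply one function to every variable, loosely like
# lapply(list_of_vectors, summarize) in R.
summaries = {name: summarize(values) for name, values in measurements.items()}

for name, summary in summaries.items():
    print(f"{name}: n={summary['n']}, mean={summary['mean']:.2f}, sd={summary['sd']:.2f}")
```

Adding a fourth variable then means adding one entry to the collection, not another copy of the analysis code.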

I agree with Byron that SQL would be a great addition to the workshop and you could integrate it with the dplyr lesson.
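
To make the dplyr-to-SQL connection concrete, here is a minimal editorial sketch using Python's built-in `sqlite3` module. The two tables, their columns, and their rows are hypothetical; the comments mark which SQL clause plays roughly the role of which dplyr verb.

```python
import sqlite3

# In-memory database with two small, made-up tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE species (species_id TEXT PRIMARY KEY, genus TEXT);
CREATE TABLE surveys (record_id INTEGER PRIMARY KEY, species_id TEXT, weight REAL);
INSERT INTO species VALUES ('DM', 'Dipodomys'), ('PF', 'Perognathus');
INSERT INTO surveys VALUES (1, 'DM', 40.0), (2, 'DM', 48.0), (3, 'PF', 7.0), (4, 'PF', NULL);
""")

query = """
SELECT sp.genus, COUNT(*) AS n, AVG(su.weight) AS mean_weight  -- select + summarize
FROM surveys AS su
JOIN species AS sp ON su.species_id = sp.species_id             -- inner_join
WHERE su.weight IS NOT NULL                                     -- filter
GROUP BY sp.genus                                               -- group_by
ORDER BY sp.genus                                               -- arrange
"""

rows = conn.execute(query).fetchall()
for genus, n, mean_weight in rows:
    print(genus, n, mean_weight)

conn.close()
```

Learners who already think in terms of filter, group, and summarize may find that the SQL clauses map onto those steps fairly directly, which is one argument for teaching the two lessons back to back.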

arthur-e commented 8 years ago

@bsmith89 I did the Intro to R version (same folder) on Day 1 and then the advanced R stuff on Day 2.

marschmi commented 8 years ago

Thanks for the great discussion and thoughts on Python, R, and SQL.

How do people feel about the shell and Git lessons? Shell and Git have been extremely useful in my research, and Git has prevented me from losing my analyses when my computer has crashed on multiple occasions. However, if these 2 lessons were included, it would make it an SWC workshop and not DC.

arthur-e commented 8 years ago

@marschmi Based on my conversations with other SNRE PhDs, I think version control would be pretty important for them to learn; can't tell you how many of them are pushing their luck without it (and how many have been burned). Also, only 1/22 SNRE PhDs/Postdocs know Git or Mercurial.

We could do a standard SWC workshop but drop in a more advanced R lesson.

marcsze commented 8 years ago

I think so far everyone has raised some really good points. I think Git is really important to teach since it becomes critical for reproducible research and large projects. It is also probably something that even those who have come to previous sessions or are comfortable with R would need a refresher on or don't know too much about.

Just my thoughts on Git.

bsmith89 commented 8 years ago

Oh! I had missed that this might be a DC workshop. If it's DC then you could replace the shell with SQL, do a full day of intermediate/advanced R, and do Git within RStudio or a different GUI.

I think that skipping the shell isn't so bad for R users, since RStudio packages some of the same functionality.

Is there an available lesson outline for GUI Git? Can command-line Git be taught without teaching shell?

marcsze commented 8 years ago

I think it is possible to teach command-line Git without teaching shell. However, it would be helpful if they knew the basics of moving in and out of directories.

I'm not aware of any GUI Git lesson outline from Software Carpentry.

singhal commented 8 years ago

Sorry I'm late to join this conversation. I must have missed the memo about this GitHub.

I almost exclusively use Python for my work because it handles strings better and I work with DNA all the time. I suspect I'm not the only one in EEB. So, doing a Python workshop makes sense.
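
As a small editorial illustration of the kind of string handling meant here (not code from the thread), the sketch below computes a reverse complement and GC content using only built-in string methods. The function names and the example sequence are hypothetical, and the code assumes unambiguous A/C/G/T bases.

```python
# Translation table mapping each base to its complement (A/C/G/T only).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_content(seq):
    """Return the fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


seq = "ATGCGTTA"  # made-up example sequence
print(reverse_complement(seq))  # TAACGCAT
print(gc_content(seq))          # 0.375
```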

Arthur - great idea to poll your potential participants. What a great way to get buy-in!

arthur-e commented 8 years ago

To summarize the discussion thus far:

@marschmi made a good point about the end-of-August workshop that applies here, too. We should decide whether this is, broadly, an SWC-style or DC-style workshop, first, and then plan from there. Marian and I will be teaching this workshop but we appreciate any insight, particularly from SNRE- or EEB-affiliated persons.

marschmi commented 8 years ago

Hi Everyone, Thanks for all the input here. Below, I propose a first go at the schedule for this (seemingly SWC-style) workshop:

We had mentioned the idea of doing a Data Carpentry workshop here; however, it seems like the participants might be too seasoned with R for this to be useful. From the above conversation, it also appears that shell, Git, and SQL could be helpful for this audience. I welcome more comments and ideas!

@singhal and @bsmith89 - I would love it if we could host a Python-flavored workshop soon. I'm hoping to learn these skills myself and would love to participate in this workshop! Maybe we could teach this in spring 2017? Or a Python workshop could be put in place for the January WISE workshop?

alixk commented 8 years ago

Just to chime in on Python: a few participants from last week's workshop mentioned they were hoping to learn Python. It would be great to either do a mini Python-only workshop sometime soon, or to do a Python-flavored SWC soon (though I really hope to do DC even sooner!).

arthur-e commented 8 years ago

Thanks @marschmi for this update and suggested outline.

Re: "seasoned" R users, I think the Data Carpentry ecology materials (covering dplyr) would be new for a lot of SNRE's R users, who I think have "base R" down but are looking for advanced techniques. Note that if EEB users are already familiar with the DC material, we could add in things I contributed at the end on performance optimization, microbenchmarking, and line profiling, in addition to other topics.

However, if we do decide that Python is more useful here, that makes overall planning and marketing easier because then this becomes a canonical (Python-flavored) Software Carpentry workshop.

I guess it depends on how advanced we think the R users are. This is what I taught at the Hatcher library a couple of weeks ago, which borrows heavily from the DC "Ecology" lesson, and it was too advanced for most of the learners, in particular because they had just had the Intro to R the previous day (brain overload!). That said, I think the advanced learners in the room found it really interesting. It did get bogged down, however, because of the unequal level of preparation in the room (had to cover factors, logical vectors, assignment operators, and functions in addition to the more interesting stuff like dplyr and tidyr).

singhal commented 8 years ago

@marschmi I would be happy to organize a workshop centered on Python. It is my favorite of the topics to teach! It would have to be in January because I am moving away from A2 at the end of January. Should we check with WISE whether they are okay with the workshop becoming Python-focused?

Ideally, how far out from a workshop should the details get set?

bsmith89 commented 8 years ago

I am also very happy to teach a Python workshop (or other topics at a Python workshop). Perhaps @singhal and I could do a 2-half-day workshop on novice Python, and then another on intermediate topics sometime after that.

I think I would be able to do something like that in mid or late November. Thoughts?

singhal commented 8 years ago

@bsmith89 That would work for me! My only concern is that people might be busy both with the early rush of the holidays & the start of exam season, but two half-day sessions might ease that burden.