tavareshugo / 2019-01-29-EMBL

Intro to R - Data Carpentry course EMBL 29-31 Jan 2019
https://tavareshugo.github.io/2019-01-29-EMBL/

Instructor notes #2

Open tavareshugo opened 5 years ago

tavareshugo commented 5 years ago

To make it easier for us, the exercises for the course have been compiled here.

Outline of things to cover:


note: extra material for ggplot2 section

So that students intuitively understand factors, introduce them in the plotting section.

For example:

When doing this plot:

surveys_complete %>% 
  ggplot(aes(sex, hindfoot_length)) +
  geom_boxplot()

What if we want to change the order of the x-axis labels so that "M" comes first?

Then we need to learn about factors, which are R's special way of encoding categorical variables.

Let's look at factors using a simple example first. Then go through the example in the course materials here, but only its very first section.
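
A possible version of that simple example, as a sketch only: the `sex` vector below is made-up data, not taken from the surveys dataset, and the variable names are just for illustration. It shows that `factor()` orders levels alphabetically unless `levels` is given, which is why "F" appears first on the x-axis by default.

```r
# A small character vector standing in for a categorical column (made-up data)
sex <- c("F", "M", "M", "F", "M")

# By default, factor() sorts the levels alphabetically
sex_default <- factor(sex)
levels(sex_default)
# "F" "M"

# Passing `levels` sets the order explicitly
sex_reordered <- factor(sex, levels = c("M", "F"))
levels(sex_reordered)
# "M" "F"

# Under the hood, a factor stores integer codes that point at its levels,
# so the same values get different codes when the level order changes
as.integer(sex_default)
# 1 2 2 1 2
as.integer(sex_reordered)
# 2 1 1 2 1
```

The values themselves are unchanged; only the level order (and so the plotting order) differs.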

From there, jump back to the plotting problem and resolve it:

surveys_complete %>% 
  # recode sex as a factor with "M" as the first level
  mutate(sex = factor(sex, levels = c("M", "F"))) %>% 
  ggplot(aes(sex, hindfoot_length)) +
  geom_boxplot()

Exercise 3.4 applies this concept again.

tavareshugo commented 5 years ago

Day 1 and 2 - split materials:

Day 3 - split materials:

tavareshugo commented 5 years ago

Extras at the end:

theavanrossum commented 5 years ago

Day 1

  • Intro - EMBL, sticky notes, etherpad, code of conduct
  • Spreadsheets (Hugo)
  • Intro (Hugo)
  • data.frame + select + filter (Georg)
  • dplyr starting from pipes (Florian) (no gather/spread)

Day 2

  • review sticky notes from yesterday
  • dplyr (Florian) (no gather/spread)
  • ggplot2 (Thea) +factors + rmd
  • (gather/spread if there's time) (Hugo)
  • sql (Hugo) - 1 hr

Day 3 - split materials:

  • review sticky notes from yesterday
  • Intro + Variation within samples (+ gather/spread) (Thea)
  • Covariation + Properties of count data (Florian)
  • PCA + limitations of biplot (Georg)
  • Clustering (Hugo)
  • debrief & survey
  • bus leaves for hdb at 5:30