Open tavareshugo opened 5 years ago
Day 1 and 2 - split materials:
Day 3 - split materials:
Extras at the end:
- `Rmd` at the end.
Day 1
- Intro - EMBL, sticky notes, etherpad, code of conduct
- Spreadsheets (Hugo)
- Intro (Hugo)
- data.frame + select + filter (Georg)
- dplyr starting from pipes (Florian) (no gather/spread)
Day 2
- review sticky notes from yesterday
- dplyr (Florian) (no gather/spread)
- ggplot2 (Thea) + factors + rmd
- (gather/spread if there's time) (Hugo)
- sql (Hugo) - 1 hr
Day 3 - split materials:
- review sticky notes from yesterday
- Intro + Variation within samples (+ gather/spread) (Thea)
- Covariation + Properties of count data (Florian)
- PCA + limitations of biplot (Georg)
- Clustering (Hugo)
- debrief & survey
- bus leaves for hdb at 5:30
To make it easier for us, the exercises for the course have been compiled here.
Outline of things to cover:
- `read_csv()` from the beginning to simplify things? (optional, but I find it useful to keep consistency across the course)
- `[rows, columns]` for subsetting and `$` to access a column. For example, I personally avoid showing the 4 different ways to access a column listed in the materials (usually it's quite confusing for beginners).

Note: extra material for the `ggplot2` section

So that students intuitively understand factors, introduce them in the plotting section.
For example:
When doing this plot:
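The plot could look something like the sketch below. This is only an illustration: the data frame `surveys_small` and its values are made up here as a stand-in for the course dataset, and the choice of a boxplot of weight by sex is an assumption.

```r
library(ggplot2)

# Hypothetical toy data standing in for the course dataset
surveys_small <- data.frame(
  sex    = c("F", "M", "F", "M", "F", "M"),
  weight = c(40, 48, 35, 52, 42, 45),
  stringsAsFactors = FALSE  # keep sex as a plain character column
)

# sex is a character column, so the x-axis categories
# appear in alphabetical order: "F" first, then "M"
p <- ggplot(surveys_small, aes(x = sex, y = weight)) +
  geom_boxplot()
p
```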
What if we want to change the order of the x-axis labels to be "M" first?
Then we need to learn about factors, which are a special way that R has to encode categorical variables.
Let's look at factors using a simple example first. Then go through the example of the course materials here, but only the very first section of it.
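A simple first example might look like the sketch below, using base R only. The vector name and its values are made up for illustration and are not taken from the course materials.

```r
# A character vector of categories (hypothetical values)
sex <- c("F", "M", "M", "F", "M")

# By default, factor levels are sorted alphabetically
sex_default <- factor(sex)
levels(sex_default)        # "F" "M"

# The levels argument sets a custom order
sex_reordered <- factor(sex, levels = c("M", "F"))
levels(sex_reordered)      # "M" "F"

# Under the hood, each value is stored as an integer code
# pointing at its level
as.integer(sex_default)    # 1 2 2 1 2
as.integer(sex_reordered)  # 2 1 1 2 1
```

The point to draw out is that the values themselves do not change, only the order of the levels, and that order is what plotting functions use to arrange categories.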
From there, jump back to the plotting problem and resolve it:
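One way to resolve it is sketched below, again with a made-up data frame (`surveys_small` and its values are hypothetical, and the boxplot is an assumed stand-in for the plot used in the lesson).

```r
library(ggplot2)

# Hypothetical toy data standing in for the course dataset
surveys_small <- data.frame(
  sex    = c("F", "M", "F", "M", "F", "M"),
  weight = c(40, 48, 35, 52, 42, 45),
  stringsAsFactors = FALSE
)

# Convert sex to a factor with "M" as the first level
surveys_small$sex <- factor(surveys_small$sex, levels = c("M", "F"))

# The x-axis now follows the level order: "M" first, then "F"
p_reordered <- ggplot(surveys_small, aes(x = sex, y = weight)) +
  geom_boxplot()
p_reordered
```

If students have already covered dplyr by this point, the same conversion could be done inside `mutate()` instead of with `$` assignment.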
Exercise 3.4 applies this concept again.