Open pepaaran opened 1 year ago
This is up next week in my lessons, so it is good that you flagged it: the timing is right. Check some of the examples I put in. There is an argument for moving all of this to the front, together with the notions of file structures.
However, we decided against that because the material is arguably rather dull, key as it is.
Maybe we can add some troubleshooting information in the exercise for data wrangling, but explain it more thoroughly in your data variety chapter. That would push the "boring" part to a later class.
Ideally we should teach them not to do this manually at all! Technically you can clean this file without touching the original, which avoids the real danger of introducing untraceable errors on input and output.
# read in the data sheet S1
# skip the first 3 rows
data <- readxl::read_xlsx(
  "1249534s1-s6.xlsx",
  sheet = "Database S1",
  skip = 3
)

# drop any rows which don't have a complete
# citation (spacer rows)
data <- data |>
  tidyr::drop_na(Citation)

# carry forward the labels in "Experiment"
data <- data |>
  tidyr::fill(
    Experiment,
    .direction = "down" # state the fill direction explicitly
  )

# all cleanup follows from here
Both `drop_na` and `fill` live in {tidyr}.
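To make the effect of these two steps easier to see, here is a minimal sketch on made-up data. The `toy` data frame and its values are hypothetical and only mimic the layout of the sheet (a label that appears once per block, plus an empty spacer row); the native pipe assumes R 4.1 or later.

```r
library(tidyr)

# Hypothetical toy data: "Experiment" is only filled in on the first row
# of each block, and the third row is a spacer without a citation
toy <- data.frame(
  Experiment = c("Exp 1", NA, NA, "Exp 2", NA),
  Citation   = c("Smith 2001", "Jones 2003", NA, "Lee 2010", "Kim 2012"),
  stringsAsFactors = FALSE
)

cleaned <- toy |>
  drop_na(Citation) |>                   # remove the spacer row
  fill(Experiment, .direction = "down")  # carry the labels forward

print(cleaned)
```

Dropping the spacer rows first matters here: filling first would still work on this toy example, but it would also give the spacer row a label that it never had in the original.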
Some students saved the file from the Exercises of Ch 3 into an Excel file and then exported it as `.csv`. When they did that, the file was saved with `;` separated values and they needed to use the `read_csv2()` function (`read_csv()` only recognises `,` separated values). Include this information somewhere in the tutorial.
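A small sketch that could serve as the troubleshooting note, assuming {readr} is installed. The temporary file and its contents are invented for illustration; only `read_csv()` and `read_csv2()` come from the discussion above.

```r
library(readr)

# Hypothetical example: write a small semicolon-separated file,
# as Excel does in locales that use "," as the decimal mark
tmp <- tempfile(fileext = ".csv")
writeLines(c("site;value", "A;1", "B;2"), tmp)

# read_csv() splits on commas, so each line ends up in a single column
wrong <- read_csv(tmp, show_col_types = FALSE)

# read_csv2() splits on semicolons and gives the two expected columns
right <- read_csv2(tmp, show_col_types = FALSE)

ncol(wrong)
names(right)
```

A quick check for students: if the data frame has only one column and its name contains `;`, the file is probably semicolon-separated and `read_csv2()` is the function to try.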