datacarpentry / semester-biology

Forkable teaching materials for course on working with data in R
http://datacarpentry.org/semester-biology

Add optional material on storing all data files to looping over files #945

Open ethanwhite opened 2 years ago

ethanwhite commented 2 years ago

From my comment on YouTube:

1) Make data a list and store each data frame loaded by the loop in its own position in that list. The first time through the loop the data frame is stored in the first position, the second time through it is stored in the second position, and so on. You can see some of the basic ideas in our Looping By Index video: https://www.youtube.com/watch?v=vWj5rypEZ4U&t=9s The modified version of the code would look like:

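# Pre-allocate a list with one empty slot per file
data <- vector(mode = "list", length = length(data_files))
# seq_along() is safe even if data_files is empty (1:length() would give 1, 0)
for (i in seq_along(data_files)) {
  # Read the i-th file and store it in the i-th position of the list
  df <- read.csv(data_files[i])
  data[[i]] <- df
}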
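
To make the list approach easier to try out, here is a minimal self-contained sketch. The two small CSV files, their column names, and the temporary directory are made up for illustration (they are not from the course data). It also names each list element after its file, which is an optional extra rather than part of the original suggestion:

```r
# Hypothetical setup: write two small CSV files to a fresh temporary directory
tmp_dir <- tempfile("list_example")
dir.create(tmp_dir)
write.csv(data.frame(id = 1:2, weight = c(10.5, 12.1)),
          file.path(tmp_dir, "site_a.csv"), row.names = FALSE)
write.csv(data.frame(id = 3:5, weight = c(9.8, 11.0, 13.4)),
          file.path(tmp_dir, "site_b.csv"), row.names = FALSE)

# Collect the file paths (list.files() returns them in alphabetical order)
data_files <- list.files(tmp_dir, pattern = "\\.csv$", full.names = TRUE)

# Store each data frame in its own position in the list
data <- vector(mode = "list", length = length(data_files))
for (i in seq_along(data_files)) {
  data[[i]] <- read.csv(data_files[i])
}

# Optional: label each position with its file name for easier lookup
names(data) <- basename(data_files)

# Count the rows in each stored data frame
n_rows <- sapply(data, nrow)
```

After this runs, data[[1]] and data[["site_a.csv"]] both return the first data frame, and n_rows holds the number of rows read from each file.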

2) Load all of the data into a single data frame. Technically you can do this with a loop: each time through the loop, load the new data and append it (using rbind) to the bottom of the combined data frame. It's now easier to do it with a single command from the purrr package:

library(purrr)  # provides map_dfr()
data <- map_dfr(data_files, read.csv)
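
For comparison, the loop-and-rbind version mentioned in option 2 might look roughly like the sketch below. It uses only base R, and the example files and column names are again invented for illustration. It assumes every file has the same columns, since rbind will fail otherwise:

```r
# Hypothetical setup: write two small CSV files to a fresh temporary directory
tmp_dir <- tempfile("rbind_example")
dir.create(tmp_dir)
write.csv(data.frame(id = 1:2, weight = c(10.5, 12.1)),
          file.path(tmp_dir, "site_a.csv"), row.names = FALSE)
write.csv(data.frame(id = 3:5, weight = c(9.8, 11.0, 13.4)),
          file.path(tmp_dir, "site_b.csv"), row.names = FALSE)
data_files <- list.files(tmp_dir, pattern = "\\.csv$", full.names = TRUE)

# Start with an empty data frame and append each file's rows to the bottom
combined <- data.frame()
for (i in seq_along(data_files)) {
  new_data <- read.csv(data_files[i])
  combined <- rbind(combined, new_data)
}

# A loop-free base R alternative: read every file, then bind them all at once
combined_alt <- do.call(rbind, lapply(data_files, read.csv))
```

Growing a data frame with rbind inside a loop copies the accumulated rows on every iteration, so it can get slow with many files. That is one reason the single-command version is usually the more convenient choice.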