Closed jordansread closed 2 years ago
Oh yeah, and I checked that the old and the new file are identical.
They actually aren't, but that is because the order of the rows is a little different (probably because of the tibble order). After accounting for that, they are the same:
all.equal(readRDS('~/Downloads/UniversityofMissouri_LimnoProfiles_2017_2020_OLD.rds') %>% arrange(DateTime, time, depth, Missouri_ID), readRDS('7a_temp_coop_munge/tmp/UniversityofMissouri_LimnoProfiles_2017_2020.rds') %>% arrange(DateTime, time, depth, Missouri_ID))
[1] TRUE
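For anyone repeating this check on other files, here is a minimal sketch of the same idea on toy data. The `old_dat` and `new_dat` objects and their columns are hypothetical stand-ins for the two `.rds` files; the point is only that sorting both tables on the same keys before `all.equal` removes row-order differences.

```r
library(dplyr)

# Hypothetical toy data standing in for the old and new .rds contents
old_dat <- tibble(
  depth = c(0, 1, 2),
  temp  = c(20.1, 19.8, 18.5)
)

# Same rows as old_dat, but in a different order
new_dat <- old_dat[c(3, 1, 2), ]

# Sort both tables on the same key(s) so row order no longer matters
old_sorted <- arrange(old_dat, depth)
new_sorted <- arrange(new_dat, depth)

same_content <- isTRUE(all.equal(old_sorted, new_sorted))
same_content
```

The sort keys need to identify each row uniquely, otherwise ties could still come out in a different order between the two tables.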
Example multi-file parse with purrr and `case_when`.

I made a few changes here to show an alternative workflow that uses a tibble instead of separate vectors of file names. I think this is easier to understand, because you can look at `files_tbl` and see where each file will be dispatched to.

I also modified the functions so they all return data.frames/tibbles with the same shape. The prior ones had some differences, which is why `all_hw_files` had a `bind_rows` with dat_2017_hw, dat_2017_hw_092, dat_2018_hw and then a `mutate`. I collapsed a few things with shared arguments to make some of the processing functions a little more generic (using `get()` to pull a column name from the function argument and use that in the `mutate` call, for example).

I used `case_when` to make the logic clear and in one place as to which file gets which handling function.

Lastly, I used `purrr::pmap` to map over the rows of the data.frame and expose both the file name and the function handler. Then the function is called within that with `exec`, which is basically a way to call the function when it is given as a string instead of an object.
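
A minimal sketch of that pattern, for readers who want to see the pieces together. This is not the code from the PR: the file names, the handler names (`parse_hw`, `parse_profile`), and the column names are all hypothetical, and the handlers build toy data instead of reading files so the example runs on its own.

```r
library(dplyr)
library(purrr)
library(rlang)

# Hypothetical handlers. Both return a tibble with the same columns
# (file, depth, temp), so the results can be row-bound without extra fixes.
# Real handlers would read `file`; these build toy data instead.
parse_hw <- function(file, temp_col = "Temp_C") {
  raw <- tibble(Depth = c(0, 1), Temp_C = c(20.1, 19.8))
  raw %>%
    # get() looks up the column whose name is stored in `temp_col`
    mutate(file = file, depth = Depth, temp = get(temp_col)) %>%
    select(file, depth, temp)
}

parse_profile <- function(file, temp_col = "WaterTemp") {
  raw <- tibble(Depth = c(0, 2), WaterTemp = c(18.0, 17.5))
  raw %>%
    mutate(file = file, depth = Depth, temp = get(temp_col)) %>%
    select(file, depth, temp)
}

# One row per file; case_when keeps the file-to-handler logic in one place
files_tbl <- tibble(
  file = c("2017_hw.csv", "2018_hw.csv", "2019_profiles.csv")
) %>%
  mutate(
    handler = case_when(
      grepl("_hw", file)       ~ "parse_hw",
      grepl("_profiles", file) ~ "parse_profile",
      TRUE                     ~ NA_character_
    )
  )

# pmap passes each row's columns as named arguments;
# exec() calls the handler that is named by a string
parsed <- pmap(files_tbl, function(file, handler) {
  exec(handler, file = file)
}) %>%
  bind_rows()

parsed
```

Printing `files_tbl` before mapping shows which handler each file will get. The tidyverse documentation also describes `.data[[temp_col]]` as a way to refer to a column by a string inside `mutate`, which could be swapped in for `get(temp_col)` here.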