camilavargasp closed this issue 3 months ago
hey all - I (Jeanette) ended up reaching out to Pascale directly to figure out what she meant and went ahead and put in a fix. the package downloads data from a couple of sources that use inconsistent file naming. the two most recent years of data have a different filename structure, so the function wasn't picking them up. that's what Pascale meant by not updating - the new datasets weren't being retrieved. it was a very quick fix so I just did it.
there are still a few cleanup tasks that need doing on the package despite the fix. I can create some issues in the repo - can either of you work on fixing them? should be pretty fast, I (Jeanette) think: (Link to issues in package repo)
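For anyone following along, here is a rough sketch of the kind of problem described above: a file filter that only knows one naming scheme silently skips files that follow another. Apart from dayflowcalculations2019.csv, the file names and both patterns below are made up for illustration, and `matches_any` is a hypothetical helper, not a function from the package.

```r
# Hypothetical listing of files; only dayflowcalculations2019.csv is a real name
# from this thread, the others are invented for the example.
files <- c("dayflow-results-1997-2018.csv",
           "dayflowcalculations2019.csv",
           "dayflow-results-2021.csv",
           "readme.txt")

# One regular expression per naming scheme (both are assumptions).
patterns <- c("^dayflow-results-.*\\.csv$",
              "^dayflowcalculations[0-9]{4}\\.csv$")

# Hypothetical helper: TRUE for each name that matches at least one pattern.
matches_any <- function(x, patterns) {
  Reduce(`|`, lapply(patterns, grepl, x = x))
}

# A filter that only knows the first scheme misses the 2019 file.
old_pick <- files[grepl(patterns[1], files)]

# Checking every known scheme picks it up as well.
new_pick <- files[matches_any(files, patterns)]
```

Keeping the schemes in one vector means a future naming change only needs one more pattern added.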
First I forked the repo because I don't have push access.
Then I found out which specific dataset is causing the parsing problem. It's dayflowcalculations2019.csv, which can be downloaded here (this URL is from urls[[8]]).
> dat <- lapply(urls[[8]], readr::read_csv, col_types=col_types, show_col_types = T, progress = T)
Rows: 379 Columns: 29
── Column specification ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Delimiter: ","
chr (29): Year, Mo, Date, SAC, YOLO, CSMR, MOKE, MISC, SJR, EAST, TOT, CCC, SWP, CVP, NBAQ, EXPORTS, GCD, PREC, MISDV, CD, XGEO, WEST, RIO, ...
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Warning message:
One or more parsing issues, call `problems()` on your data frame for details, e.g.:
dat <- vroom(...)
problems(dat)
On closer inspection:
> dayflowcalculations2019 <- read_csv("dayflowcalculations2019.csv")
Rows: 379 Columns: 29
── Column specification ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Delimiter: ","
chr (26): Year, Mo, Date, SAC, YOLO, CSMR, MOKE, MISC, SJR, EAST, TOT, CCC, SWP, CVP, NBAQ, EXPORTS, GCD, PREC, MISDV, CD, XGEO, WEST, RIO, ...
dbl (2): EFFEC, EFFDIV
num (1): X2
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Warning message:
One or more parsing issues, call `problems()` on your data frame for details, e.g.:
dat <- vroom(...)
problems(dat)
> problems()
# A tibble: 379 × 5
     row   col expected   actual     file
   <int> <int> <chr>      <chr>      <chr>
 1     2    30 29 columns 30 columns /home/anchen/inundation/dayflowcalculations2019.csv
 2     3    30 29 columns 30 columns /home/anchen/inundation/dayflowcalculations2019.csv
 3     4    30 29 columns 30 columns /home/anchen/inundation/dayflowcalculations2019.csv
 4     5    30 29 columns 30 columns /home/anchen/inundation/dayflowcalculations2019.csv
 5     6    30 29 columns 30 columns /home/anchen/inundation/dayflowcalculations2019.csv
 6     7    30 29 columns 30 columns /home/anchen/inundation/dayflowcalculations2019.csv
 7     8    30 29 columns 30 columns /home/anchen/inundation/dayflowcalculations2019.csv
 8     9    30 29 columns 30 columns /home/anchen/inundation/dayflowcalculations2019.csv
 9    10    30 29 columns 30 columns /home/anchen/inundation/dayflowcalculations2019.csv
10    11    30 29 columns 30 columns /home/anchen/inundation/dayflowcalculations2019.csv
# ℹ 369 more rows
# ℹ Use `print(n = ...)` to see more rows
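A quick way to see this kind of mismatch without going through readr is to count the fields on each line with base R's `utils::count.fields()`. The sketch below uses a small made-up file, not the real CSV, in which the data rows carry one more field than the header (here through a trailing comma, which is only an assumption about how such a mismatch might arise).

```r
# Toy stand-in for the real file: 3 header fields, 4 fields on each data row.
path <- tempfile(fileext = ".csv")
writeLines(c("Year,Mo,SAC",
             "2019,10,100,",
             "2019,10,110,",
             "2019,10,120,"), path)

# Number of comma-separated fields on every line of the file.
n_fields <- utils::count.fields(path, sep = ",", quote = "\"")

# Tabulate the counts to see how many lines have each width.
field_summary <- table(n_fields)
field_summary

# Lines whose width differs from the header line.
odd_lines <- which(n_fields != n_fields[1])
```

On a real file, a change in field count partway down can also hint that the layout changes there.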
Need to investigate further next week 🔍
Edit: Wait, it seems like there are actually 2 different tables in this csv file! I'll let Jeanette know.
Jeanette let me know that she did not want that second table to make it into the integrated dataset so I had to do some special parsing on the file.
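Roughly, the idea of that special parsing can be sketched like this. This is not the code from the PR, just a minimal illustration on a made-up file, and it assumes the second table is separated from the first by a blank line, which may not match the real layout.

```r
# Toy file with two tables; the blank-line separator is an assumption.
path <- tempfile(fileext = ".csv")
writeLines(c("Year,Mo,SAC",
             "2019,10,100",
             "2019,10,110",
             "",
             "Station,Note",
             "SAC,example"), path)

lines <- readLines(path)

# Treat the first blank line as the end of the first table.
blank <- which(trimws(lines) == "")
last_line <- if (length(blank) > 0) blank[1] - 1 else length(lines)

# Parse only the lines that belong to the first table.
first_table <- utils::read.csv(text = lines[seq_len(last_line)],
                               stringsAsFactors = FALSE)
```

With readr, a similar effect could probably be had by passing the number of data rows to the `n_max` argument of `read_csv()` once the cut-off line is known.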
Opened a pull request here so that Jeanette can give me feedback.
The other issue (upgrade setup R step) has also been taken care of in that PR.
Jeanette approved my changes, so the PR has been merged into main and is now closed.
🎉 yay!! I think we're done with maintaining this package for now.
Jeanette -- the Delta crowd is having trouble with an R package you helped write. From an email thread, they sent Matt:
During the 2021 NCEAS-DSP SWG, I published a R package with Jeanette Clark (https://github.com/goertler/inundation). I was contacted by researchers trying to use it and it seems the part that Jeanette wrote is not updating. I'm not sure what they mean by "updating" but it is likely an R package maintenance issue.