Open lucy-schick opened 5 months ago
Put a bunch of time into the issue and I've got it all working except for one step.
https://github.com/lucy-schick/fishbc/blob/main/R/cdc.Rmd contains all the current instructions to update `cdc.csv`. I say "current" because hopefully in the future we may be able to pull directly from the API or find a better way to do this. There was still a significant amount of data wrangling to do.
All the updated CSVs and rdata files are also in my fork: https://github.com/lucy-schick/fishbc/tree/main/data-raw
STILL TO DO: actually run `fishbc::cdc` and get the updated data. I've run:
> usethis::use_data(cdc, overwrite = TRUE)
✔ Setting active project to '/Users/lucyschick/Projects/repo/fishbc'
✔ Saving 'cdc' to 'data/cdc.rda'
• Document your data (see 'https://r-pkgs.org/data.html')
Then `fishbc::cdc`, but it still returns the old csv. I've tried restarting R and RStudio, same issue. I've also tried running it in a different repo and still the same issue. There's a good chance I may just be doing something wrong, so any ideas are welcome!
Thought I found my issue but now not so much.
When using `fishbc::` it uses the old data. This makes sense, I guess, since I am explicitly calling the package. But if I just call `cdc` it returns the new data, obviously, since that is an object in my environment.
But is there a way to call `fishbc::cdc` and have it return the updated data? Yes, I think so, but I think I need to build the package locally to do so (using `devtools` like we did for local fpr functions). I thought that running `usethis::use_data(cdc, overwrite = TRUE)` would make the package use the updated data, but I guess not... still not too sure about all this so any ideas are more than welcome. Was reading up on it here: https://r-pkgs.org/data.html
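A minimal sketch of why the bare name and the `fishbc::` call disagree, assuming the usual R lookup rules apply here. `datasets::iris` is only a stand-in for `fishbc::cdc`, and the global `iris` is a stand-in for the freshly wrangled `cdc` object; neither is part of the fishbc workflow.

```r
# Illustration only: `datasets::iris` stands in for `fishbc::cdc`, and the
# global `iris` stands in for the updated `cdc` object in the workspace.
iris <- head(datasets::iris, 3)

# Bare name: R finds the object in the global environment first.
n_global <- nrow(iris)

# pkg::name: R reads from the installed package and ignores the global object.
n_installed <- nrow(datasets::iris)

print(c(global = n_global, installed = n_installed))
```

`usethis::use_data()` only writes `data/cdc.rda` into the package source, so the installed copy that `fishbc::cdc` reads stays unchanged until the package is re-installed (or loaded from source with something like `devtools::load_all()`).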
Now the next question: how do we use the updated data in other projects? Options:
1) Build the package locally so we can call `fishbc::cdc` and it uses the updated data, OR
2) Read in the updated cdc file in each project. `scripts/02_reporting/extract-fiss-species-table.R` creates the fish species table and burns it to `data/inputs_extracted/fiss_species_table.csv`, so it's not the end of the world to read in the cdc file because we only do this step once.
I'm going to go with option 2 for now because I know how to do that!
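Option 2 could look roughly like the sketch below. The path `data/inputs_raw/cdc.csv` and the helper name `read_cdc` are hypothetical, chosen only for illustration; point the path at wherever the updated file actually lives in the project.

```r
# Hypothetical location of the updated file inside a project; adjust as needed.
cdc_path <- "data/inputs_raw/cdc.csv"

# Hypothetical helper: fail loudly if the file is missing, otherwise read it.
read_cdc <- function(path) {
  if (!file.exists(path)) {
    stop("Updated cdc file not found at: ", path)
  }
  utils::read.csv(path, stringsAsFactors = FALSE)
}

# Only read when the file is present, so sourcing this script never errors.
if (file.exists(cdc_path)) {
  cdc <- read_cdc(cdc_path)
}
```

Failing on a missing file keeps the script from silently falling back to the stale `fishbc::cdc`.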
Did you try "Build / Install / Clean and Install" in the Rstudio IDE by chance?
Build / Test first can be a good move too
Once it's installed, `library(that_pkg)` will load that installed package.
You can add issues to your fork of a repo using settings in github. Prob not a bad idea to do that and point to this one.
Another way to do it is to push the branch to GitHub and load it with pak, similar to what we do with rbbt:
pak::pkg_install("NewGraphEnvironment/rbbt@bbt_bib_result")
where `bbt_bib_result` is the name of the branch.
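Applied to this case, the same pattern might look like the sketch below. It assumes the updated `data/cdc.rda` has been pushed to the `main` branch of the `lucy-schick/fishbc` fork linked earlier in this thread; swap in a different branch name if the changes live elsewhere.

```r
# Assumption: the updated data/cdc.rda has been pushed to this fork and branch.
owner  <- "lucy-schick"
repo   <- "fishbc"
branch <- "main"

# pak accepts GitHub references of the form "owner/repo@ref".
ref <- paste0(owner, "/", repo, "@", branch)

# Guarded so that sourcing this non-interactively does not trigger an install.
if (interactive() && requireNamespace("pak", quietly = TRUE)) {
  pak::pkg_install(ref)
}
```

After the install, restart R so the session picks up the new copy before calling `fishbc::cdc`.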
Build problem fixed by using "Build / Install / Clean and Install" in the Rstudio IDE, thanks!
I realized the updated data was missing the dates in the COSEWIC and SARA columns, so I went to fix that and realized that the Summary Data download from the CDC has all the info we need in one place to update `cdc.csv` (instead of having to join other downloaded results together), which is awesome but also needed a bit of work.
Everything is good to go now (finallyyy). I would be happy to make a PR to fishbc with my branch, but maybe we could review what I've done first so that it's in the best format for them. I'm not sure if they will want an Rmd file. I did my best to explain the whole updating process so that it will be quick in the future, but I'm not sure if that's what they will want... we can deal with this later once we have more time. For now I will just update the fish species tables in the reports.
Link to fishbc issue https://github.com/poissonconsulting/fishbc/issues/13
Issue: While building the `fiss_species_table.csv` in `extract-fiss-species-table.R` I noticed that the results from `fishbc::cdc` are not up to date with https://species-registry.canada.ca/index-en.html#/species?ranges=British%20Columbia&sortBy=commonNameSort&sortDirection=asc&pageSize=10&keywords=sockeye
Example: the Sockeye Salmon - Francois-Fraser-S Population is listed as "Special concern" with COSEWIC, but that does not show in the results from `fishbc::cdc`.
Looks like `fishbc` pulls from the BC Conservation Data Centre (CDC), which is also up to date, with Sockeye Salmon - Francois-Fraser-S Population listed as "Special concern".
This is also true for the following populations in this project's scope (use the Nechako River):
What I found: Looks like the `cdc.csv` that provides the data (code here) has not been updated for 4 years... https://github.com/poissonconsulting/fishbc/blob/main/data-raw/cdc/cdc.csv
I could be totally wrong but it seems odd that this csv hasn't been updated in 4 years...