NewGraphEnvironment / fish_passage_fraser_2023_reporting

https://newgraphenvironment.github.io/fish_passage_fraser_2023_reporting/
Creative Commons Zero v1.0 Universal

`fishbc` data not up to date #75

Open lucy-schick opened 5 months ago

lucy-schick commented 5 months ago

Issue: While building the fiss_species_table.csv in extract-fiss-species-table.R I noticed that the results from fishbc::cdc are not up to date with https://species-registry.canada.ca/index-en.html#/species?ranges=British%20Columbia&sortBy=commonNameSort&sortDirection=asc&pageSize=10&keywords=sockeye.

Example: The Sockeye Salmon - Francois-Fraser-S Population is listed as "Special concern" with COSEWIC, but that does not show in the results from fishbc::cdc.

Screen Shot 2024-04-04 at 2 46 49 PM Screen Shot 2024-04-04 at 2 53 23 PM

Looks like fishbc pulls from the BC Conservation Data Centre (CDC), which is up to date and lists the Sockeye Salmon - Francois-Fraser-S Population as "Special concern".

Screen Shot 2024-04-04 at 3 12 51 PM

This is also true for the following populations in this project's scope (those that use the Nechako River):

What I found: Looks like the cdc.csv that provides the data (code here) has not been updated for 4 years... https://github.com/poissonconsulting/fishbc/blob/main/data-raw/cdc/cdc.csv

I could be totally wrong but it seems odd that this csv hasn't been updated in 4 years...

lucy-schick commented 2 months ago

Put a bunch of time into the issue and I've got it all working except for one step.

> usethis::use_data(cdc, overwrite = TRUE)
✔ Setting active project to '/Users/lucyschick/Projects/repo/fishbc'
✔ Saving 'cdc' to 'data/cdc.rda'
• Document your data (see 'https://r-pkgs.org/data.html')

Then

fishbc::cdc

but it still returns the old csv. I've tried restarting R and RStudio, same issue. I've also tried running it in a different repo and still get the same issue. There's a good chance I may just be doing something wrong, so any ideas are welcome!

lucy-schick commented 2 months ago

Thought I found my issue but now not so much

When using fishbc::cdc it returns the old data. I think this makes sense, since I am explicitly calling the package. But if I just call cdc it returns the new data, obviously, since that is an object in my environment.

But is there a way to call fishbc::cdc and have it return the updated data? I think so, but I think I need to build the package locally to do so (using devtools like we did for local fpr functions). I thought that running usethis::use_data(cdc, overwrite = TRUE) would make the package use the updated data, but I guess not... still not too sure about all this, so any ideas are more than welcome. I was reading up on it here: https://r-pkgs.org/data.html
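To make the difference between `cdc` and `fishbc::cdc` concrete, here is a minimal sketch that uses the built-in `datasets::mtcars` as a stand-in for `fishbc::cdc` (the stand-in is an assumption for illustration only, so the example runs without fishbc installed). A bare name is looked up in the current environment first, while the `pkg::` form reads from the installed package.

```r
# Illustration only: `datasets::mtcars` stands in for `fishbc::cdc`.
mtcars <- datasets::mtcars      # local copy of the packaged dataset
mtcars$mpg <- mtcars$mpg * 2    # the "updated" local object

local_value <- mtcars$mpg[1]                # bare name: finds the local copy
installed_value <- datasets::mtcars$mpg[1]  # pkg:: form: reads the installed package

# TRUE: modifying the local object did not change the installed copy
local_value == 2 * installed_value
```

By the same logic, `usethis::use_data()` only writes `data/cdc.rda` into the package source directory; the copy in the R library stays as it was until the package is reinstalled.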

Now the next question: how do we use the updated data in other projects? Options:

1) Build package locally so we can call fishbc::cdc and it uses the updated data OR

2) Read in the updated cdc file in each project. scripts/02_reporting/extract-fiss-species-table.R creates the fish species table and burns it to 'data/inputs_extracted/fiss_species_table.csv', so it's not the end of the world to read in the cdc file because we only do this step once.

I'm going to go with option 2 for now because I know how to do that!
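Option 2 could look something like the sketch below. `read_cdc()` is a hypothetical helper and the example path is an assumption, not a file that exists in this repo. It only reads the csv and fails loudly if the file is missing, so a project never silently falls back to the outdated packaged data.

```r
# Hypothetical helper: read a locally stored, updated copy of cdc.csv.
read_cdc <- function(path) {
  if (!file.exists(path)) {
    stop("Updated cdc file not found: ", path)
  }
  # check.names = FALSE keeps the column names exactly as written in the csv
  utils::read.csv(path, stringsAsFactors = FALSE, check.names = FALSE)
}

# Example call (commented out because the path is illustrative only):
# cdc <- read_cdc("data/inputs_raw/cdc.csv")
```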

NewGraphEnvironment commented 2 months ago

Did you try "Build / Install / Clean and Install" in the RStudio IDE by chance?


Build / Test first can be a good move too

Once it's installed, library(that_pkg) will load that installed package.
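For anyone working from the console instead of the IDE, a roughly comparable step is `devtools::install()`. The wrapper below is a hypothetical sketch, not something from this thread: `reinstall_local()` is an invented name, and the assumption is that `path` points at the local clone of the package (the directory containing `DESCRIPTION`).

```r
# Hypothetical wrapper around devtools::install() for a local package clone.
reinstall_local <- function(path = ".") {
  # Guard: only proceed if `path` looks like a package source directory
  if (!file.exists(file.path(path, "DESCRIPTION"))) {
    stop("No DESCRIPTION file found in: ", path)
  }
  # Install the local source into the R library without upgrading dependencies
  devtools::install(path, upgrade = "never")
}

# Example (commented out; run from the package directory, then restart R):
# reinstall_local(".")
# library(fishbc)
```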

You can add issues to your fork of a repo using settings in github. Prob not a bad idea to do that and point to this one.

NewGraphEnvironment commented 2 months ago

Another way to do it is to push the branch to GitHub and load it with pak, similar to what we do with rbbt

pak::pkg_install("NewGraphEnvironment/rbbt@bbt_bib_result") - bbt_bib_result is the name of the branch

lucy-schick commented 2 months ago

Build problem fixed by using "Build / Install / Clean and Install" in the RStudio IDE, thanks!

I realized the updated data was missing the dates in the COSEWIC and SARA columns, so I went to fix that and realized that the Summary Data download from the CDC has all the info we need in one place to update cdc.csv (instead of having to join other downloaded results together), which is awesome but also needed a bit of work.

Everything is good to go now (finallyyy). I would be happy to make a PR to fishbc with my branch, but maybe we could review what I've done first so that it's in the best format for them. I did my best to explain the whole updating process in an Rmd file so that it will be quick in the future, but I'm not sure if that's what they will want... we can deal with this later once we have more time. For now I will just update the fish species tables in the reports.

Link to fishbc issue https://github.com/poissonconsulting/fishbc/issues/13