Closed by mountainMath 2 years ago
A compromise would be to have the package emit a warning if the data API version changes, alerting the user to the possibility of recalled data and linking to a script they can run to purge that data from their cache (and update the data API version on non-recalled data). It sits between implementing 2. and 3.: the code that would otherwise go into the updated package for targeted cache purges could just be posted on GitHub, and doing this with a warning rather than automatically keeps users in control of their cache.
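A minimal sketch of such a warning, assuming each cache entry records the data API version it was downloaded under. The helper name warn_on_version_change(), the `path` and `version` columns, the version strings, and the script link placeholder are all hypothetical and not part of cancensus.

```r
# Hypothetical helper: warn when cached entries were downloaded under a
# different data API version than the one currently reported.
# `cache_meta` is assumed to be a data frame with columns `path` and `version`.
warn_on_version_change <- function(cache_meta, current_version,
                                   script_url = "<link to purge script>") {
  stale <- cache_meta[cache_meta$version != current_version, , drop = FALSE]
  if (nrow(stale) > 0) {
    warning(sprintf(
      "%d cached dataset(s) predate data API version %s and may contain recalled data. See %s to purge them.",
      nrow(stale), current_version, script_url
    ), call. = FALSE)
  }
  invisible(stale)  # nothing is deleted here; the user stays in control
}

# Made-up cache metadata for illustration
cache_meta <- data.frame(
  path = c("CM_data_a.rda", "CM_data_b.rda"),
  version = c("d.1", "d.2")
)

# Capture the warning text instead of letting it propagate
msg <- tryCatch(warn_on_version_change(cache_meta, "d.2"),
                warning = function(w) conditionMessage(w))
print(msg)
```

The function only reports stale entries and returns them invisibly, so any purge remains an explicit step taken by the user.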
I wonder what the CRAN protocol is for having something that reads from a list of data issues and clarifications, updated in real time, in an .onLoad or .onAttach call.
Let's say the first time you load the package in a session, it reads from an external list we maintain with notifications like this. Sounds like something CRAN would dislike, though.
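A rough sketch of the once-per-session idea, under some assumptions: show_notices_once() and the .session_state environment are hypothetical names, and the fetch step is passed in as a function so the example runs offline. CRAN policy asks that packages using internet resources fail gracefully when the resource is unavailable, so fetch errors are swallowed here, and the notices go through packageStartupMessage() so users can suppress them.

```r
# Session-level state; in a package this would live in the package namespace.
.session_state <- new.env(parent = emptyenv())

# Hypothetical: show notices at most once per session. `fetch_notices` is
# injected so the sketch runs offline; a real version would read the remote
# list and must tolerate the resource being unreachable.
show_notices_once <- function(fetch_notices) {
  if (isTRUE(.session_state$notified)) return(invisible(FALSE))
  .session_state$notified <- TRUE
  notices <- tryCatch(fetch_notices(), error = function(e) character(0))
  for (n in notices) packageStartupMessage(n)
  invisible(length(notices) > 0)
}

# In a package this would be called from .onAttach(libname, pkgname).
first  <- show_notices_once(function() "Some 2021 vectors were recalled.")
second <- show_notices_once(function() "Some 2021 vectors were recalled.")
```

The second call returns without fetching or printing anything, so repeated attaches in one session stay quiet.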
Current plan is:
This is now implemented on the server at https://censusmapper.ca/api/v1/recall.csv
Pushed some functionality for 4. and 5. To test, first grab some data that has been recalled via:
library(cancensus)
library(dplyr)  # for slice()

level <- "CSD"
regions <- list(CMA = "59933")
# Take the first "decile" vector and expand it to its leaf vectors
deciles_2021 <- find_census_vectors("decile", "CA21", "Total") |>
  slice(1) |>
  child_census_vectors(leaves_only = TRUE)
data_2021 <- get_census("CA21", regions = regions,
                        vectors = setNames(deciles_2021$vector, deciles_2021$label),
                        level = level)
Then list_recalled_cached_data() lists cached data that has been recalled and remove_recalled_chached_data() removes it from the local cache.
Additionally, get_recalled_database() grabs the recall data from the CensusMapper server and caches it for the duration of the session.
New functions relating to data recall are in the recalled_data.R file.
What's still missing is a call to list_recalled_cached_data() at the beginning of each get_census call that emits a warning if the returned tibble has a non-zero row count. Checking in get_census will catch instances where data was cached before the newer data versioning was implemented, as well as instances where the initial warning was ignored. Emit another warning if cached data used in the call has been recalled. Maybe add some colour or something to make it more visible.
Both 2. and 3. are implemented now.
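The check at the start of each get_census call could look roughly like the sketch below. It is not the package's implementation: the helper names, the column layouts of the cache metadata and the recall table, and the vector identifiers are all assumptions made for illustration.

```r
# Hypothetical layouts; the real cache metadata and recall.csv may differ.
cache_meta <- data.frame(
  path = c("CM_data_a", "CM_data_b"),
  dataset = c("CA21", "CA21"),
  stringsAsFactors = FALSE
)
# List column: the vectors stored in each cache entry (made-up identifiers)
cache_meta$vectors <- list(c("v_CA21_1", "v_CA21_2"), c("v_CA21_3"))
recalled <- data.frame(dataset = "CA21", vector = "v_CA21_2",
                       stringsAsFactors = FALSE)

# Return the cache entries that contain at least one recalled vector
find_recalled_entries <- function(cache_meta, recalled) {
  hit <- vapply(seq_len(nrow(cache_meta)), function(i) {
    ds <- cache_meta$dataset[i]
    any(cache_meta$vectors[[i]] %in% recalled$vector[recalled$dataset == ds])
  }, logical(1))
  cache_meta[hit, , drop = FALSE]
}

# Would run first thing inside a data-fetching call
check_before_fetch <- function(cache_meta, recalled) {
  hits <- find_recalled_entries(cache_meta, recalled)
  if (nrow(hits) > 0) {
    warning(sprintf("%d cached dataset(s) contain recalled data.", nrow(hits)),
            call. = FALSE)
  }
  invisible(hits)
}

# Capture the warning text for inspection
msg <- tryCatch(check_before_fetch(cache_meta, recalled),
                warning = function(w) conditionMessage(w))
print(msg)
```

Because the check only looks at local metadata plus a recall table that is already cached for the session, it should add little overhead per call.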
Addressed in upcoming v0.5.3.
StatCan has recalled several variables from the 2021 data release, and this has to be propagated up in CensusMapper and the cancensus package. Similar issues arose in the 2016 release, and the CensusMapper API as well as the cansim package now have built-in functionality for data versioning and better metadata for locally cached data, which allows for targeted recall of data.
But we have not established a clear workflow for doing this. One possible workflow is:
This will
What's still missing is clear logic to handle the cached data via cancensus. The package now keeps data version and metadata information, which can be viewed via list_cancensus_cache(). At this point the cache needs to be invalidated manually, but the package should be updated to do this automatically. There are several ways to do this.