ScotGovAnalysis / opendatascot

An R package to pull data from statistics.gov.scot into R
https://scotgovanalysis.github.io/opendatascot/
MIT License

Identifying modification of database #108

Closed Andrew-Saul closed 1 year ago

Andrew-Saul commented 1 year ago

Is there a way to determine, from within opendatascot, when a dataset of interest (e.g. "Ante-Natal Smoking") was last modified? And is there a way to use the package to download the datasets that were modified on a certain day?

GordonBryden commented 1 year ago

Not at present, but I plan to add a "get metadata" function to collect that and other data about a dataset.

GordonBryden commented 1 year ago

OK, so apparently I created ods_metadata in 2021, and forgot about it.

For your use case, call ods_metadata("smoking-at-booking") and look at the "modified" value, which gives the modification time to the nearest millisecond.
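As a rough sketch of that lookup: the helper below is hypothetical (it is not part of opendatascot) and assumes that ods_metadata() returns a table with `metadata` and `value` columns, as the code later in this thread does. The timestamp layout is also an assumption (ISO 8601 with fractional seconds), and the example table is made up so the snippet runs offline.

```r
# Hypothetical helper (not part of opendatascot): pull the "modified"
# timestamp out of a metadata table and parse it as a date-time.
# Assumes columns named `metadata` and `value`.
extract_modified <- function(meta) {
  stamp <- as.character(meta$value[meta$metadata == "modified"])
  # No "modified" row: return a missing date-time instead of failing
  if (length(stamp) == 0) return(as.POSIXct(NA))
  # Assumed layout, e.g. "2022-12-14T09:30:00.123Z"; %OS reads fractional seconds
  as.POSIXct(stamp[[1]], format = "%Y-%m-%dT%H:%M:%OS", tz = "UTC")
}

# Usage with the package (needs network access):
# meta <- opendatascot::ods_metadata("smoking-at-booking")
# extract_modified(meta)

# Offline illustration with a made-up metadata table
example_meta <- data.frame(
  metadata = c("label", "modified"),
  value = c("Example dataset", "2022-12-14T09:30:00.123Z")
)
modified_at <- extract_modified(example_meta)
as.Date(modified_at)
```

Returning NA when there is no "modified" row keeps the helper safe to map over many datasets, some of which may lack that field.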

To find all datasets modified on a certain day, you would need to get the list of all datasets using ods_all_datasets(), apply ods_metadata() to each dataset in that dataframe (using apply or purrr), and then filter the results.

GordonBryden commented 1 year ago

Code to find all datasets modified on a certain day:


library(dplyr)
library(opendatascot)
library(purrr)
library(lubridate)

# List every dataset; the URI column identifies each one
all_datasets <- ods_all_datasets()

# Return the "modified" timestamp for one dataset, or NA if it has none
meta_get <- function(uri) {
  modified <- ods_metadata(uri) %>%
    filter(metadata == "modified") %>%
    pull(value)
  if (length(modified) == 0) NA_character_ else as.character(modified[[1]])
}

# One metadata request per dataset, so this can take a while.
# map_chr() keeps one value per dataset, so rows stay aligned with all_datasets.
all_datasets %>%
  mutate(modified = as_date(ymd_hms(map_chr(URI, meta_get)))) %>%
  select(URI, modified) %>%
  filter(modified == as_date("2022-12-14"))