The reference population used for standardisation needs to be available when the reusable action runs. One option would be to retrieve reference populations on the fly from a reputable source (e.g. ONS) every time the reusable action is run, but this isn't possible inside the server, as most connections outside the server are denied.
This repo currently contains a locally-run script that gets mid-year population estimates from the ONS, stratified by single year of age, sex, and region, using the onsr package. The script is only a convenience: the equivalent dataset can be downloaded directly from the ONS website, though it would need some reshaping.
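As a rough illustration of the reshaping mentioned above, here is a minimal sketch that turns a wide table (one column per single year of age) into one row per region-sex-age stratum. It is written in Python purely for illustration, whereas the repo's script uses R and onsr. The column names (`region`, `sex`, `age_0`, ...) and the numbers are made up; the actual ONS download layout may differ.

```python
import csv
import io

# Hypothetical wide layout with made-up numbers: one row per region and sex,
# one column per single year of age. The real ONS file may be laid out differently.
WIDE_CSV = """region,sex,age_0,age_1,age_2
North East,female,100,110,120
North East,male,105,115,125
"""


def wide_to_long(wide_text):
    """Return one record per region-sex-age stratum."""
    rows = []
    for record in csv.DictReader(io.StringIO(wide_text)):
        for column, value in record.items():
            # Only the age_* columns hold population counts.
            if not column.startswith("age_"):
                continue
            rows.append(
                {
                    "region": record["region"],
                    "sex": record["sex"],
                    "age": int(column[len("age_"):]),
                    "population": int(value),
                }
            )
    return rows


long_rows = wide_to_long(WIDE_CSV)
```

Each element of `long_rows` is then a single stratum that can be joined to study data by age, sex, and region.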
What are the other options?
1. For any given repo, get the reference population and commit it to the repo. It will then be available inside the server when the reusable action is run. The current repo contains a script for doing exactly that (it gets mid-year population estimates from the ONS, stratified by age, sex, and region, using the onsr package). This is probably fine, but slightly unsatisfactory, as it provides no built-in guarantee that the reference population is assured, reliable, or uncorrupted.
2. Make a set of standard reference populations available from the opensafely CLI, similar to how codelists are imported into repos using opensafely codelist. This gives us some built-in quality control but requires a bit of dev time.
3. Make reference populations available in an R package that can be loaded locally and in the server.
The first option is probably OK for now, but a set of standard, assured reference populations that can be retrieved automatically would be nice to have (e.g. stratified by ethnicity, European populations, etc.).
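If the first option is used, one lightweight way to get part of the way towards an "uncorrupted" guarantee might be to record a checksum when the file is committed and check it before use. The sketch below is an assumption rather than anything this repo currently does; the function names and the file name `reference_population.csv` are hypothetical. Note that it only detects changes to the file, not whether the source data were correct in the first place.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_reference_population(path, expected_sha256):
    """Raise ValueError if the file does not match the recorded checksum."""
    actual = sha256_of(path)
    if actual != expected_sha256:
        raise ValueError(
            f"checksum mismatch for {path}: expected {expected_sha256}, got {actual}"
        )
    return True


# Demonstration on a throwaway file (hypothetical file name and contents).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "reference_population.csv"
    demo_path.write_bytes(b"region,sex,age,population\n")
    recorded = sha256_of(demo_path)  # would be stored alongside the file in the repo
    demo_ok = verify_reference_population(demo_path, recorded)

    # Simulate the file changing after the checksum was recorded.
    demo_path.write_bytes(b"region,sex,age,population\ncorrupted\n")
    try:
        verify_reference_population(demo_path, recorded)
        demo_tampered_detected = False
    except ValueError:
        demo_tampered_detected = True
```

A check like this could run at the start of the reusable action, since it needs no connection outside the server.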