https://fingertips.phe.org.uk/profile/guidance/supporting-information/api
Fingertips has an API, documented at the link above.
Python and R libraries also exist, but they don't seem to cover all of the API's functionality.
There are multiple ways to retrieve data. What makes most sense for this project is to retrieve data for a group of indicator ids for one area_type_id at a time.
import fingertips_py as ftp
data_for_multiple_ind_ids_for_one_area = ftp.retrieve_data.get_data_by_indicator_ids(
    indicator_ids=ids_as_str,  # [Maximum 100]
    area_type_id=area_type_id,  # can be found in the documentation
    include_sortable_time_periods=True,  # includes an int format column for time period
    is_test=False)
https://www.api.gov.uk/ons/open-geography-portal/#open-geography-portal
The ONS Open Geography Portal has an API.
A dataset URL follows the format: 'https://services1.arcgis.com/ESMARspQHYMw9BZ9/arcgis/rest/services/' + dataset_name + '/FeatureServer/0/query?outFields=&where=1%3D1&f=geojson'. For example: https://services1.arcgis.com/ESMARspQHYMw9BZ9/arcgis/rest/services/Clinical_Commissioning_Groups_April_2019_Boundaries_EN_BUC_2022/FeatureServer/0/query?outFields=&where=1%3D1&f=geojson (Note: only FeatureServer datasets can be downloaded, not MapServer ones.)
dataset_names can be found in this directory: https://services1.arcgis.com/ESMARspQHYMw9BZ9/ArcGIS/rest/services
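The URL format above can be assembled with a small helper. This is a minimal sketch: `build_query_url` is a hypothetical name rather than part of any library, and the query string is copied unchanged from the example above.

```python
# Base path and query suffix are taken verbatim from the URL format above.
BASE_URL = "https://services1.arcgis.com/ESMARspQHYMw9BZ9/arcgis/rest/services/"
QUERY_SUFFIX = "/FeatureServer/0/query?outFields=&where=1%3D1&f=geojson"


def build_query_url(dataset_name: str) -> str:
    """Return the GeoJSON query URL for a FeatureServer dataset (hypothetical helper)."""
    return BASE_URL + dataset_name + QUERY_SUFFIX


url = build_query_url("Clinical_Commissioning_Groups_April_2019_Boundaries_EN_BUC_2022")
print(url)
```

Keeping the base path and suffix as constants means only the dataset_name changes between downloads.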
See the pipeline script.
A list of the indicator_ids available at each area_type_id is downloaded from the API.
For the area_type_ids of interest, save all indicator_id values that are available at these area_type_ids. These are in chunks of 100 indicators per file, as this is the API's limit.
Data is then retrieved for one area_type_id at a time.
See the script, which is currently not a module.
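The 100-indicator limit mentioned above implies splitting a longer list of ids into batches before each request. The following is a rough sketch of that batching only; `chunk_ids` and `MAX_IDS_PER_REQUEST` are hypothetical names, and the ids are made up for illustration.

```python
from typing import Iterator, List

MAX_IDS_PER_REQUEST = 100  # the API limit noted above


def chunk_ids(indicator_ids: List[int], size: int = MAX_IDS_PER_REQUEST) -> Iterator[List[int]]:
    """Yield consecutive slices of at most `size` ids (hypothetical helper)."""
    for start in range(0, len(indicator_ids), size):
        yield indicator_ids[start:start + size]


example_ids = list(range(1, 251))  # 250 made-up ids
chunks = list(chunk_ids(example_ids))
print([len(chunk) for chunk in chunks])  # [100, 100, 50]
```

Each chunk could then be passed to a single request and written to its own file.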
QlikSense dashboards are currently internal.
Data for an indicator is available for:
a time period (time_period_sortable in the data)
an area type (area_type_id). An area_type_id is an area_type_class at a specific area_type_year_configuration. For example, area_type_id 301 is the LTLA area_type_class at the 2020 area_type_year_configuration.
A unique indicator_dataset_id is made up of indicator_id, time_period_sortable, sex, age, and area_type_id. Only a single indicator_dataset_id can be mapped at a time.
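One way to picture the composite key described above is as an immutable record of its five parts. This is only an illustrative sketch: the class name `IndicatorDatasetKey`, the field types, and the sample values are all assumptions, not taken from the project's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen makes instances hashable and comparable by value
class IndicatorDatasetKey:
    """Hypothetical record of the fields that make up one indicator_dataset_id."""
    indicator_id: int
    time_period_sortable: int
    sex: str
    age: str
    area_type_id: int


# Made-up values for illustration only.
key_a = IndicatorDatasetKey(1, 20200000, "Persons", "All ages", 301)
key_b = IndicatorDatasetKey(1, 20200000, "Persons", "All ages", 301)
key_c = IndicatorDatasetKey(1, 20200000, "Male", "All ages", 301)

print(key_a == key_b)              # True: same five fields, same dataset
print(len({key_a, key_b, key_c}))  # 2: changing sex gives a different dataset
```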
For an indicator_id, we only want the latest area_type_year_configuration for each area_type_class, and then we only want the latest time_period.
For each indicator_id and area_type_class, only keep the latest (max) area_type_year_configuration.
For each indicator_id and area_type_class, only keep the latest (max) time_period_sortable.
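The two filtering steps above can be sketched in plain Python as the same filter applied twice. The `keep_latest` helper and the sample rows are hypothetical. The sample includes a row with a later time period but an older year configuration, to show why the order of the steps matters.

```python
def keep_latest(rows, group_cols, value_col):
    """Keep only rows holding the max `value_col` within each group (hypothetical helper)."""
    latest = {}
    for row in rows:
        key = tuple(row[col] for col in group_cols)
        if key not in latest or row[value_col] > latest[key]:
            latest[key] = row[value_col]
    return [
        row for row in rows
        if row[value_col] == latest[tuple(row[col] for col in group_cols)]
    ]


# Made-up rows for illustration only.
rows = [
    {"indicator_id": 1, "area_type_class": "LTLA",
     "area_type_year_configuration": 2019, "time_period_sortable": 20210000},
    {"indicator_id": 1, "area_type_class": "LTLA",
     "area_type_year_configuration": 2020, "time_period_sortable": 20190000},
    {"indicator_id": 1, "area_type_class": "LTLA",
     "area_type_year_configuration": 2020, "time_period_sortable": 20200000},
]

group = ("indicator_id", "area_type_class")
step_1 = keep_latest(rows, group, "area_type_year_configuration")  # drops the 2019 configuration
step_2 = keep_latest(step_1, group, "time_period_sortable")        # keeps the latest remaining period
print(step_2)
```

Here the 2019 configuration row is removed first, so its later time period never competes with the rows from the 2020 configuration.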
Now for each unique indicator_id and area_type_class combination, there will only be 1 area_type_year_configuration and 1 time_period_sortable left.
Use WHERE EXISTS statements to only include what is left from the previous steps.
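As a rough illustration of the WHERE EXISTS approach, the sketch below runs the filtering against an in-memory SQLite database using Python's built-in sqlite3 module. The table name `indicator_data`, the `value` column, the CTE names, and the sample rows are all assumptions made for the example; the project's actual SQL may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE indicator_data (
        indicator_id INTEGER,
        area_type_class TEXT,
        area_type_year_configuration INTEGER,
        time_period_sortable INTEGER,
        value REAL
    )
""")
# Made-up rows for illustration only.
conn.executemany("INSERT INTO indicator_data VALUES (?, ?, ?, ?, ?)", [
    (1, "LTLA", 2019, 20210000, 10.0),
    (1, "LTLA", 2020, 20190000, 11.0),
    (1, "LTLA", 2020, 20200000, 12.0),
    (1, "UTLA", 2021, 20200000, 13.0),
])

query = """
WITH latest_config AS (
    -- Step 1: latest year configuration per indicator and area type class
    SELECT indicator_id, area_type_class,
           MAX(area_type_year_configuration) AS year_config
    FROM indicator_data
    GROUP BY indicator_id, area_type_class
),
latest_period AS (
    -- Step 2: latest time period among rows that survived step 1
    SELECT d.indicator_id, d.area_type_class,
           d.area_type_year_configuration AS year_config,
           MAX(d.time_period_sortable) AS time_period
    FROM indicator_data AS d
    WHERE EXISTS (
        SELECT 1 FROM latest_config AS c
        WHERE c.indicator_id = d.indicator_id
          AND c.area_type_class = d.area_type_class
          AND c.year_config = d.area_type_year_configuration
    )
    GROUP BY d.indicator_id, d.area_type_class, d.area_type_year_configuration
)
-- Final selection: only rows matching what is left after both steps
SELECT d.indicator_id, d.area_type_class, d.area_type_year_configuration,
       d.time_period_sortable, d.value
FROM indicator_data AS d
WHERE EXISTS (
    SELECT 1 FROM latest_period AS p
    WHERE p.indicator_id = d.indicator_id
      AND p.area_type_class = d.area_type_class
      AND p.year_config = d.area_type_year_configuration
      AND p.time_period = d.time_period_sortable
)
ORDER BY d.indicator_id, d.area_type_class
"""
result = conn.execute(query).fetchall()
print(result)
```

With these sample rows, one row remains for each indicator_id and area_type_class combination.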