Open banksad opened 1 year ago
This is what I've used before, in case it's helpful (though undoubtedly not the best way, and specific to monthly data):
import pandas as pd
import requests


def get_ons_series(dataset, code):
    url = f"https://api.ons.gov.uk/timeseries/{code}/dataset/{dataset}/data"
    # Get the data from the ONS API, failing early on an HTTP error:
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    json_data = response.json()
    title = json_data["description"]["title"]
    # Monthly observations only; parse dates and values into proper dtypes.
    df = (
        pd.json_normalize(json_data["months"])
        .assign(
            date=lambda x: pd.to_datetime(x["date"]),
            value=lambda x: pd.to_numeric(x["value"]),
        )
        .set_index("date")
    )
    df["title"] = title
    return df
Have tested grab_ons_data.py and it works really well.
The OBR model only produces results on a quarterly basis (results are then aggregated to calendar and financial years). So I think we want to change the control flow at the end so that it always picks the quarterly series when both monthly and quarterly data are available.
e.g. for CPI (dataset MM23, CDID D7BT) there are both quarterly and monthly data.
Will make a suggested change in this branch.
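A rough sketch of the "quarterly first" rule, for discussion only. It assumes the API response holds lists of observations under "quarters" and "months" keys (the snippet above only confirms "months"), and `pick_quarterly_first` is a hypothetical helper name, not something already in grab_ons_data.py:

```python
def pick_quarterly_first(payload):
    """Return (frequency_key, observations), preferring quarterly data."""
    # Assumption: observations sit in lists under "quarters" and "months".
    # Quarterly is checked first, so it wins whenever both are present.
    for key in ("quarters", "months"):
        observations = payload.get(key) or []
        if observations:
            return key, observations
    raise ValueError("no quarterly or monthly observations in payload")


# Made-up values purely for illustration, not real ONS figures.
example = {
    "months": [{"date": "2020 JAN", "value": "1.0"}],
    "quarters": [{"date": "2020 Q1", "value": "1.1"}],
}
frequency, observations = pick_quarterly_first(example)
print(frequency)  # quarters
```

Monthly data is still used as a fallback when a series has no quarterly observations, so nothing that works today should stop working.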
The OBR model uses CDID codes as identifiers in the model spec.
These CDID codes can be used to query the ONS API.
I have some code to do this from a previous project, so I can start collecting the series listed in the ons_identifiers.txt file.
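As a starting point, something like the following could turn the identifiers into API URLs. This is only a sketch: it assumes ons_identifiers.txt holds one CDID code per line (the actual layout may differ) and that the dataset ID for each code is known separately. `read_cdid_codes` and `build_urls` are hypothetical names; the URL pattern is the one used in get_ons_series.

```python
BASE_URL = "https://api.ons.gov.uk/timeseries/{code}/dataset/{dataset}/data"


def read_cdid_codes(lines):
    """Collect CDID codes from an iterable of text lines."""
    # Assumption: one code per line; blank lines and repeats are skipped.
    codes = []
    for line in lines:
        code = line.strip()
        if code and code not in codes:
            codes.append(code)
    return codes


def build_urls(codes, dataset):
    """Map each CDID code to its ONS API URL for the given dataset."""
    return {code: BASE_URL.format(code=code, dataset=dataset) for code in codes}


# In-memory stand-in for the file contents.
sample_lines = ["D7BT\n", "\n", "D7BT\n"]
urls = build_urls(read_cdid_codes(sample_lines), "MM23")
print(urls["D7BT"])
```

With the real file, an open file handle can be passed straight to `read_cdid_codes`, since file objects iterate line by line.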