barneydobson opened 7 months ago
It's UK only - but the rain gauge API is pretty easy to use and has 15 minute data.
https://environment.data.gov.uk/flood-monitoring/doc/rainfall
Figure out what station you want:
curl -X GET "https://environment.data.gov.uk/flood-monitoring/id/stations?parameter=rainfall" -H "accept: application/json"
Download the data:
curl -X GET "https://environment.data.gov.uk/flood-monitoring/id/measures/0890TH-rainfall-tipping_bucket_raingauge-t-15_min-mm/readings" -H "accept: application/json"
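The readings endpoint also accepts date filters, so a request can be restricted to a window rather than pulling everything. A minimal sketch of building such a URL, assuming the `startdate`/`enddate`/`_limit` query parameters documented on the page above:

```python
from urllib.parse import urlencode

BASE = "https://environment.data.gov.uk/flood-monitoring/id"
measure = "0890TH-rainfall-tipping_bucket_raingauge-t-15_min-mm"

# Restrict the readings request to a date window and cap the row count
params = {"startdate": "2020-01-01", "enddate": "2020-01-05", "_limit": 500}
url = f"{BASE}/measures/{measure}/readings?{urlencode(params)}"
print(url)
```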
In Python it'd be something like this (though the more I look at the API, the more it looks like it only works for the last 100 days...):
```python
import logging

import pandas as pd
import requests

logger = logging.getLogger(__name__)


def uk_raingauge_downloader(bbox: tuple[float, float, float, float],
                            start_date: str = '2020-01-01',
                            end_date: str = '2020-01-05') -> pd.DataFrame:
    """Download precipitation data within bbox.

    Args:
        bbox (tuple): Bounding box coordinates in the format
            (minx, miny, maxx, maxy).
        start_date (str, optional): Start date. Defaults to '2020-01-01'.
        end_date (str, optional): End date. Defaults to '2020-01-05'.

    Returns:
        df (DataFrame): DataFrame containing downloaded data.
    """
    # Get the list of rainfall stations
    url_root = "https://environment.data.gov.uk/flood-monitoring/id"
    url_stations = f"{url_root}/stations?parameter=rainfall"
    response = requests.get(url_stations)
    stations = pd.DataFrame(response.json()['items'])

    # Remove stations without lat/long
    stations = stations.dropna(subset=['lat', 'long'])

    # Calculate the distance of each station to the centre of the bbox
    # (planar approximation) and pick the nearest station
    centre_x = (bbox[0] + bbox[2]) / 2
    centre_y = (bbox[1] + bbox[3]) / 2
    stations['distance'] = ((stations['lat'] - centre_y)**2 +
                            (stations['long'] - centre_x)**2)**0.5
    nearest_station = stations.loc[stations['distance'].idxmin()]

    # Warn if no station is within the bbox
    within_bbox = (bbox[1] <= nearest_station['lat'] <= bbox[3] and
                   bbox[0] <= nearest_station['long'] <= bbox[2])
    if not within_bbox:
        logger.warning(
            "No stations found within the provided bounding box. "
            f"Using the nearest station, at lat: {nearest_station.lat}, "
            f"long: {nearest_station.long}, with station 'id' of "
            f"{nearest_station.stationReference}.")

    # Download the data for the first measure of the selected station
    url_data = (f"{nearest_station['measures'][0]['@id']}/readings"
                f"?startdate={start_date}&enddate={end_date}")
    data_items = requests.get(url_data).json()['items']

    # Convert to DataFrame
    df = pd.DataFrame(data_items)
    return df
```
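One caveat with the snippet above: nearest-station selection uses planar Euclidean distance on raw lat/long degrees, which is distorted at UK latitudes (a degree of longitude is much shorter than a degree of latitude). A haversine great-circle distance would be more robust; a self-contained sketch, not tied to the API:

```python
import math

def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km between two (lat, lon) points."""
    r = 6371.0  # mean Earth radius, km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2 +
         math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

# London to Birmingham, roughly 160-165 km
print(haversine_km(51.5074, -0.1278, 52.4862, -1.8904))
```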
```python
def download_precipitation(bbox: tuple[float, float, float, float],
                           start_date: str = '2015-01-01',
                           end_date: str = '2015-01-05',
                           username: str = '<your_username>',
                           api_key: str = '<your_api_key>') -> pd.DataFrame:
    """Download precipitation data within bbox."""
    country = get_country(bbox[0], bbox[1])[2]
    if country == 'GB':
        return uk_raingauge_downloader(bbox, start_date, end_date)
    return cds_era5_downloader(bbox, start_date, end_date, username, api_key)
```
https://zenodo.org/records/8369987 may be useful (switching to this would close #20).
Currently just using a simple design storm (since the precipitation downloader is broken). https://doi.org/10.3390/w15010046 shows, and I'm sure the same is true in any high-fidelity sewer network simulation paper of any kind, that results are super sensitive to which storm event is chosen. I have some code in the old repo for parameterised storm separation.
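For reference, a parameterised design storm can be as simple as a triangular pulse hyetograph scaled to a total depth. This is a minimal sketch of that idea, not the code from the old repo:

```python
def triangular_hyetograph(total_mm: float, duration_steps: int) -> list[float]:
    """Symmetric triangular pulse: intensity ramps up to a peak at the
    mid-point of the event and back down, scaled so depths sum to total_mm."""
    mid = (duration_steps - 1) / 2
    # Raw triangular weights, peaking at the centre of the event
    weights = [1 - abs(i - mid) / (mid + 1) for i in range(duration_steps)]
    scale = total_mm / sum(weights)
    return [w * scale for w in weights]

# e.g. a 20 mm storm spread over eight 15-minute steps
storm = triangular_hyetograph(total_mm=20.0, duration_steps=8)
print([round(d, 2) for d in storm])
```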
@cheginit and @tijanajovanovic rightly pointed out that the spatial distribution of a storm event can be just as important as the event itself. There is plenty of evidence in the literature to support this. We will see how results look first, but methods can be discussed here.
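One simple way to represent spatial distribution from several gauges is inverse-distance weighting of gauge values at a point of interest. A sketch with made-up gauge locations and depths (planar distance for brevity):

```python
def idw(point: tuple[float, float],
        gauges: list[tuple[float, float, float]],
        power: float = 2.0) -> float:
    """Inverse-distance-weighted rainfall at `point` from
    (lat, lon, value) gauge tuples."""
    num = den = 0.0
    for lat, lon, value in gauges:
        d2 = (lat - point[0]) ** 2 + (lon - point[1]) ** 2
        if d2 == 0:
            return value  # point coincides with a gauge
        w = 1.0 / d2 ** (power / 2)
        num += w * value
        den += w
    return num / den

# Hypothetical gauges (lat, lon, rainfall mm) around a catchment centroid
gauges = [(51.50, -0.12, 4.0), (51.55, -0.10, 6.0), (51.48, -0.20, 2.0)]
print(idw((51.51, -0.14), gauges))
```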
In the first instance, it would be good to simulate at least two storms (possibly a pulse or design storm and a real storm) to see whether the sensitivity analysis results go mad or not.
The read/write should be handled under #84