dossgollin-lab / climate-data

Doss-Gollin Lab Climate Data Repository

HCFCD Data #12

Open jdossgollin opened 1 year ago

jdossgollin commented 1 year ago

It would be helpful to add tooling to download HCFCD rainfall (and stream gauge) data.

This data is available at https://www.harriscountyfws.org/, where data for a specific station can be accessed, often in increments as short as 5 minutes. We should brainstorm the steps. I see this involving the following:

  1. Get a list of all available stations (and ideally what their start dates are)
  2. Figure out all the raw files that need to be downloaded (IMO roughly 1 file should correspond to 1 station x 1 time step. The time step should be the longest possible that still gives us the right data. Maybe a month?)
  3. Download the "raw" data for each station
  4. Aggregate the "raw" data for each station (which will be in many different files) into a single file for that station. This will need to be re-run each time more raw data is added.
  5. Create an aggregated file indexed by time and station ID. I suggest a NetCDF4 file with two dimensions: time and location. Data variables would be latitude (indexed by location) and longitude (indexed by location). Appropriate metadata to indicate units, missing data, etc. is also needed.
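
To make steps 2 and 4 concrete, here is a rough sketch in Python using only the standard library. It assumes one raw file per station per calendar month, and that each raw file can be parsed into (timestamp, value) pairs. The names `DownloadTask`, `plan_downloads`, and `merge_chunks` and the station IDs are made up for illustration, and nothing here touches the actual HCFCD site, whose download interface still needs to be worked out.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DownloadTask:
    """One raw file: a single station over a single calendar month."""
    station_id: str
    start: date  # first day of the month (inclusive)
    end: date    # first day of the following month (exclusive)


def next_month(day: date) -> date:
    """Return the first day of the month after `day`."""
    if day.month == 12:
        return date(day.year + 1, 1, 1)
    return date(day.year, day.month + 1, 1)


def plan_downloads(station_starts: dict, through: date) -> list:
    """Step 2: list every (station, month) file needed up to `through`.

    `station_starts` maps a station ID to the first date it has data.
    """
    tasks = []
    for station_id, first in sorted(station_starts.items()):
        current = date(first.year, first.month, 1)
        while current <= through:
            tasks.append(DownloadTask(station_id, current, next_month(current)))
            current = next_month(current)
    return tasks


def merge_chunks(chunks: list) -> list:
    """Step 4: combine the raw chunks for one station into one sorted series.

    Each chunk is a list of (timestamp, value) pairs. If a timestamp appears
    in more than one chunk, the value from the later chunk is kept, so the
    merge can be re-run whenever more raw data is added.
    """
    merged = {}
    for chunk in chunks:
        for timestamp, value in chunk:
            merged[timestamp] = value
    return sorted(merged.items())


# Hypothetical station IDs and start dates, for illustration only.
stations = {"A100": date(2023, 11, 15), "B200": date(2024, 1, 3)}
tasks = plan_downloads(stations, through=date(2024, 2, 10))
for task in tasks:
    print(task.station_id, task.start, task.end)
```

Keeping the plan separate from the download itself means the list of tasks can be compared against the files already on disk, so only missing months are fetched.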

This is a non-trivial task but a good problem for someone to hack on in their spare time.