geoglows / pygeoglows

A python package of tools coming from the GEOGloWS initiative
https://geoglows.org
BSD 3-Clause "New" or "Revised" License

geoglows.bias.correct_historical - ignore non-reporting months #25

Open plgrover opened 2 years ago

plgrover commented 2 years ago

I tried using geoglows.bias.correct_historical with data from Alberta, Canada. Most of the gauges in the north do not report during the winter months, so the historical record for those periods is null. As expected, when I try to bias correct, I get an error:

File /srv/conda/envs/notebook/lib/python3.9/site-packages/geoglows/bias.py:128, in _flow_and_probability_mapper(monthly_data, to_probability, to_flow, extrapolate)
    125     raise ValueError('You need to specify either to_probability or to_flow as True')
    127 # get maximum value to bound histogram
--> 128 max_val = math.ceil(np.max(monthly_data.max()))
    129 min_val = math.floor(np.min(monthly_data.min()))
    131 if max_val == min_val:

ValueError: cannot convert float NaN to integer
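
The failure in the traceback can be reproduced without geoglows. The following is a minimal sketch using made-up data (the `observed` frame and its `flow` column are assumptions for illustration, not the station record): when a month contains only missing values, its maximum is NaN, and `math.ceil` cannot convert NaN to an integer.

```python
import math

import numpy as np
import pandas as pd

# Hypothetical observed record: a gauge that only reports in summer.
index = pd.date_range("2020-01-01", "2020-12-31", freq="D")
observed = pd.DataFrame({"flow": np.nan}, index=index)
summer = observed.index.month.isin([6, 7, 8])
observed.loc[summer, "flow"] = 10.0

# Select January, where every value is missing.
january = observed[observed.index.month == 1]

try:
    # Same expression as line 128 of bias.py in the traceback.
    max_val = math.ceil(np.max(january.max()))
    error_message = None
except ValueError as err:
    error_message = str(err)

print(error_message)
```

The same expression succeeds for a month such as July, where at least one observation exists.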

It would be great if the bias correction could ignore specified months, or had an option to perform the bias correction on something other than a monthly basis.
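
Until such an option exists, one possible workaround is to restrict both series to the months in which the gauge actually reports before correcting. This is only a sketch under stated assumptions: `reporting_months` and `keep_months` are hypothetical helpers, the data is made up, and whether `geoglows.bias.correct_historical` accepts a simulated series with whole months removed has not been verified here.

```python
import numpy as np
import pandas as pd


def reporting_months(observed: pd.DataFrame) -> list:
    """Return the calendar months (1-12) with at least one non-null observation."""
    valid = observed.dropna(how="all")
    return sorted(int(m) for m in valid.index.month.unique())


def keep_months(data: pd.DataFrame, months: list) -> pd.DataFrame:
    """Keep only the rows of `data` whose index falls in one of `months`."""
    return data[data.index.month.isin(months)]


# Hypothetical data: observations exist from May through October only.
index = pd.date_range("2019-01-01", "2020-12-31", freq="D")
observed = pd.DataFrame({"flow": 5.0}, index=index)
winter = ~observed.index.month.isin(list(range(5, 11)))
observed.loc[winter, "flow"] = np.nan
simulated = pd.DataFrame({"flow": 7.0}, index=index)

months = reporting_months(observed)
simulated_subset = keep_months(simulated, months)
observed_subset = keep_months(observed, months).dropna()

print(months)

# Assumption: the subsets could then be passed to
# geoglows.bias.correct_historical(simulated_subset, observed_subset).
```

The winter months are left uncorrected in this approach, so the simulated values for those months would have to be used as they are or handled separately.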

rileyhales commented 2 years ago

Can you provide the specific code that produced this error and some sample data?

plgrover commented 2 years ago

Below is the code. I am using a Water Survey of Canada station, 05DC001. Again, this station does not report during the winter period. I guess I could insert non-zero values during that period to help with the bias correction code.

import geoglows
import pandas as pd

# observed record for the gauge; it has no values for the winter months
g05DC001_Df = pd.read_csv('g05DC001_Df.csv', parse_dates=['Datetime'], index_col='Datetime')
reach_id = 13021101

# download the simulated historical flow and set negative values to zero
simulated_historical = geoglows.streamflow.historic_simulation(reach_id, forcing='era_5', return_format='csv')
simulated_historical.index = pd.to_datetime(simulated_historical.index)
simulated_historical[simulated_historical < 0] = 0
# round-trip through date strings to reduce the index to plain dates
simulated_historical.index = simulated_historical.index.to_series().dt.strftime("%Y-%m-%d")
simulated_historical.index = pd.to_datetime(simulated_historical.index)

# bias correct (this call raises the ValueError shown above)
corrected_historical = geoglows.bias.correct_historical(simulated_historical, g05DC001_Df)
corrected_historical.index = pd.to_datetime(corrected_historical.index)
corrected_historical.index = corrected_historical.index.to_series().dt.strftime("%Y-%m-%d")
corrected_historical.index = pd.to_datetime(corrected_historical.index)

g05DC001_Df.csv