narlesky opened this issue 2 years ago
@narlesky That is a good question. I asked Rob and he also could not remember why he used a value of 0.9 in the `at_hour == 22` and `at_hour == 23` `elif` statements.
Thinking about what the function is intended to do, I think you found an issue worth exploring. For example, the same logic could be rewritten as:
```python
elif at_hour == 22:
    reapportion = daily_flow - 1.9*value_hourly
elif at_hour == 23:
    reapportion = daily_flow - 2*value_hourly
```
which does not make much sense. I will test and get back to you.
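To make the concern concrete, here is the rewritten logic above evaluated with made-up numbers (the `daily_flow` and `value_hourly` values are purely illustrative, not from the repo):

```python
def reapportion_rewritten(at_hour, daily_flow, value_hourly):
    # Direct transcription of the two rewritten elif branches.
    if at_hour == 22:
        return daily_flow - 1.9 * value_hourly
    if at_hour == 23:
        return daily_flow - 2 * value_hourly

daily_flow, value_hourly = 240.0, 100.0
print(reapportion_rewritten(22, daily_flow, value_hourly))  # 50.0
print(reapportion_rewritten(23, daily_flow, value_hourly))  # 40.0

# Once value_hourly exceeds half the daily flow, the reapportioned
# volume goes negative, e.g.:
print(reapportion_rewritten(23, 240.0, 130.0))  # -20.0
```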
Thanks @danhamill!
@narlesky Things have been busy but I have made a little progress simplifying the code. See Spline_from_Pandas.py.
The first thing I wanted to get rid of was all the faux-date business. I think treating everything as a `pd.Period` will avoid any date overflow issues with `pd.Timestamp`.
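As a minimal illustration of the `pd.Period` point (the dates here are made up): nanosecond-resolution `pd.Timestamp` values can only represent roughly 1677 through 2262, while `pd.Period` has no such ceiling, so hourly bookkeeping on far-out dates won't overflow.

```python
import pandas as pd

# Nanosecond Timestamps top out in 2262...
print(pd.Timestamp.max)  # 2262-04-11 23:47:16.854775807

# ...but Periods happily represent dates beyond that range.
idx = pd.period_range("2500-01-01", periods=24, freq="h")
print(idx[0], idx[-1])  # 2500-01-01 00:00 2500-01-01 23:00
```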
The next thing I was working on was getting away from the text file inputs and outputs. My goal is to read and write directly from DSS. The `spline` function in `Spline_from_Pandas` accepts a dataframe with two columns. Right now I have the flow column hard coded, which is undesirable. Passing a dataframe implies I am reading from DSS outside of the method, but I think we could make this an object to simplify usage of the algorithm. Here is how I was reading in the time series:
```python
import pandas as pd
from pydsstools.heclib.dss import HecDss
from core.Spline_from_Pandas import spline

def get_daily_ts(path, dss_file):
    with HecDss.Open(dss_file) as fid:
        ts = fid.read_ts(path, trim_missing=False)
    times = ts.pytimes
    values = ts.values
    tmp = pd.DataFrame(data={'date': times, 'flow': values.copy(),
                             'site': path.split("/")[2]})
    tmp.date = tmp.date - pd.DateOffset(hours=1)
    return tmp

df_daily = get_daily_ts(path, file)
df_hourly = spline(pivot)
```
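One way the "make this an object" idea could look. This is entirely a sketch: `DailyToHourly` and `flow_col` are names I made up, and the DSS and spline imports are deferred into the methods so the class can be defined without pydsstools installed.

```python
import pandas as pd

class DailyToHourly:
    """Hypothetical wrapper pairing the DSS read with the spline call."""

    def __init__(self, dss_file, flow_col="flow"):
        self.dss_file = dss_file
        self.flow_col = flow_col  # would replace the hard-coded flow column

    def read_daily(self, path):
        # Deferred import so the sketch doesn't require pydsstools at import time.
        from pydsstools.heclib.dss import HecDss
        with HecDss.Open(self.dss_file) as fid:
            ts = fid.read_ts(path, trim_missing=False)
        df = pd.DataFrame({'date': ts.pytimes,
                           self.flow_col: ts.values.copy(),
                           'site': path.split("/")[2]})
        df.date = df.date - pd.DateOffset(hours=1)
        return df

    def to_hourly(self, path):
        from core.Spline_from_Pandas import spline
        return spline(self.read_daily(path))

d = DailyToHourly("example.dss", flow_col="local_flow")
print(d.flow_col)  # local_flow
```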
Another thing I have yet to tackle is investigating other types of splines. This:

```python
from scipy import interpolate

spline_function = interpolate.splrep(masked_dates.compressed(),
                                     hourly_accumulation.dropna().values, s=0)
```

is by no means optimal. An unconstrained cubic spline like this has to dip down before it goes back up. When you run the spline on an accumulation time series, that dip produces negative flows once the interpolated hourly accumulation is converted back to a hydrograph using successive differences.
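The negative-flow problem can be demonstrated with a toy accumulation series, and a monotone interpolant such as `scipy.interpolate.PchipInterpolator` is one possible alternative worth testing (the numbers below are made up; this is a sketch, not the repo's actual data or method):

```python
import numpy as np
from scipy import interpolate

# Daily accumulation (cumulative flow): a flat stretch followed by a sharp rise.
days = np.arange(6, dtype=float)
accum = np.array([0.0, 10.0, 10.0, 10.0, 50.0, 60.0])

hours = np.linspace(0, 5, 121)

# Unconstrained cubic spline (splrep with s=0) dips below the plateau
# just before the rise, so differencing yields negative "hourly flows".
tck = interpolate.splrep(days, accum, s=0)
cubic_flows = np.diff(interpolate.splev(hours, tck))
print("cubic min hourly flow:", cubic_flows.min())  # negative

# A monotone interpolant (PCHIP) never undershoots non-decreasing data,
# so the differenced flows stay non-negative (up to rounding).
pchip = interpolate.PchipInterpolator(days, accum)
pchip_flows = np.diff(pchip(hours))
print("pchip min hourly flow:", pchip_flows.min())
```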
Lately I have been working on local flows, which aren't the best to test with since there can be large portions of the time series without any local flows.
@danhamill thanks for the update! What a funny coincidence that I had just visited this repo and the CriticalDuration repo earlier today, starred them, and shared them with our GitHub org! I definitely understand the balancing act between text and DSS inputs. We are still working with CSVs a decent portion of the time, but I can see why the pydsstools integration would be so valuable!
Haha. I saw all the activity on git and remembered I hadn't committed lately. I was mid-way through an update and got sidetracked.
Well it's much appreciated!
@narlesky off topic, but I came across Qiusheng Wu's GitHub profile and he has some really great open source GIS repositories. Check out leafmap and geemap.
@danhamill Oh cool! I used to work on a project with a similar theme based on Mapbox. I'll have to check it out!
Hi @danhamill thanks for sharing this repo! It's great to see this algorithm implemented in Python -- so much more accessible!!
As I was reading through the peak insertion logic, I had a question about the calculation for how the daily volume is reapportioned when a new peak is added. Does this make sense as a way to express what you've written? I was curious about the origin of the 0.9 factor -- is that an approximation of 22 / 24? Thanks in advance! - Carly
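As a quick sanity check of the 22/24 guess (this only evaluates the ratio; it says nothing about the original code's intent):

```python
# 22/24 rounds to 0.9167, so 0.9 would only be a rough approximation of it.
print(round(22 / 24, 4))  # 0.9167
```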