Closed by wknoben 7 months ago
Thanks a lot for the suggestion. I will solve this issue as soon as possible.
Hi Dr. Knoben,
I have released version 0.0.9, which forces clean_streamflow to run before separation.
In the previous version, load_streamflow had to be run first, which implicitly ran clean_streamflow. The new version eliminates the need for load_streamflow and instead accepts a pandas DataFrame as input (see the updated README).
However, the current clean_streamflow removes missing values directly instead of interpolating them. Interpolation may not work well for long sequences of missing values. Given that a digital filter converges to similar values regardless of the initial value, directly removing missing values may not substantially affect the separation result.
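The convergence argument can be checked with a small experiment. The sketch below is not the package's code: it assumes a generic single-pass Lyne-Hollick-style filter, and the function name `lyne_hollick_pass`, the parameter `alpha=0.925`, and the synthetic flow series are all illustrative choices. It runs the filter twice from the two extreme initial baseflow values and compares the results.

```python
import numpy as np

def lyne_hollick_pass(q, alpha=0.925, b0=None):
    """One forward pass of a Lyne-Hollick-style recursive filter (illustrative)."""
    q = np.asarray(q, dtype=float)
    b = np.empty_like(q)
    # Initial baseflow, clipped to the physically valid range [0, q[0]].
    b[0] = q[0] if b0 is None else min(max(b0, 0.0), q[0])
    f_prev = q[0] - b[0]  # initial quickflow
    for t in range(1, len(q)):
        # Recursive quickflow update, then clip so that 0 <= baseflow <= q[t].
        f = alpha * f_prev + 0.5 * (1 + alpha) * (q[t] - q[t - 1])
        f = min(max(f, 0.0), q[t])
        b[t] = q[t] - f
        f_prev = f
    return b

# Synthetic, strictly positive daily streamflow (made-up data).
rng = np.random.default_rng(0)
q = 2 + 5 * np.abs(np.sin(np.arange(365) / 20)) + rng.random(365)

b_low = lyne_hollick_pass(q, b0=0.0)    # start with zero baseflow
b_high = lyne_hollick_pass(q, b0=q[0])  # start with all flow as baseflow

print(abs(b_low[0] - b_high[0]))    # large difference at the start
print(abs(b_low[-1] - b_high[-1]))  # difference after one year
```

Because the clipping step cannot increase the gap between the two runs, the difference shrinks by at least a factor of `alpha` per time step. The flip side is that the first few dozen values after any restart (including a restart after a dropped gap) still depend on the initial value.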
Thanks again for your suggestion, and I would appreciate your thoughts on the updated version.
Sincerely, Jiaxin
Hi Jiaxin,
Thanks for looking into this!
It seems to me that both approaches (drop missing values and interpolate them) have the same issue in the sense that the values at the edges (just before and after the missing values) won't be very accurate. So presumably either is as good as things are going to get.
Thanks for letting me contribute further thoughts. My suggestions would be:
Thanks for your prompt response. The new version keeps the NaNs in the right places and returns a dataframe with the same shape as the input streamflow. However, to ensure accurate recession constant estimates, clean_streamflow also drops years with fewer than 120 days of streamflow records, resulting in additional NaNs.
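As an illustration of the behaviour described here, the following sketch masks every year that has fewer than 120 valid daily values while keeping the shape of the input. It is not the package's clean_streamflow: the helper name `mask_short_years`, the `min_days` parameter, and the example data are assumptions made for this example only.

```python
import numpy as np
import pandas as pd

def mask_short_years(df, min_days=120):
    """Set all values to NaN in years with fewer than `min_days` valid records.

    Hypothetical helper; expects a DataFrame with a DatetimeIndex and one
    column per station. The output has the same shape and index as the input.
    """
    # Count valid (non-NaN) days per calendar year and column, broadcast
    # back to the original daily index.
    valid_per_year = df.notna().groupby(df.index.year).transform("sum")
    # Keep values only where the year has enough records; NaN elsewhere.
    return df.where(valid_per_year >= min_days)

# Two years of made-up daily data; 2001 keeps only its first 100 days.
index = pd.date_range("2000-01-01", "2001-12-31", freq="D")
flow = pd.DataFrame({"site_a": 1.0}, index=index)
flow.loc["2001-04-11":, "site_a"] = np.nan

cleaned = mask_short_years(flow)
print(cleaned.shape == flow.shape)          # shape is preserved
print(cleaned.loc["2001"].isna().all())     # 2001 is fully masked
```

Masking rather than dropping rows means downstream results still line up with the original dates, at the cost of the additional NaNs mentioned above.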
Regarding your first suggestion, I will add a comment to the example code in the next release. Thanks again for your feedback!
If any NaNs are present, the returned baseflow estimates contain only NaNs for all methods apart from Fixed and Slide, and KGE values will be undefined for these 10 methods. MWE:
Depending on what is considered intended behaviour, user-friendliness would be improved if either:
Nuances with approach 2 are that the edge values before and after NaNs are not necessarily accurate, and that the KGE function may need further updates to deal with NaNs in either time series.
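One way a KGE calculation could cope with NaNs in either series is to score only the time steps where both values are present. The sketch below assumes the original (2009) form of the Kling-Gupta efficiency; the function name `kge_nan_aware` is hypothetical and the package's own KGE function may be defined differently.

```python
import numpy as np

def kge_nan_aware(sim, obs):
    """KGE computed only over time steps where both series are finite numbers.

    Hypothetical helper. Returns NaN when fewer than two valid pairs remain.
    """
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    ok = ~(np.isnan(sim) | np.isnan(obs))  # pairwise-complete mask
    if ok.sum() < 2:
        return np.nan
    s, o = sim[ok], obs[ok]
    r = np.corrcoef(s, o)[0, 1]   # linear correlation
    alpha = s.std() / o.std()     # variability ratio
    beta = s.mean() / o.mean()    # bias ratio
    return 1 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

obs = np.array([1.0, 2.0, np.nan, 4.0, 5.0])
sim = obs.copy()
sim[0] = np.nan  # NaNs at different positions in the two series

score = kge_nan_aware(sim, obs)
print(score)
```

Note that this does nothing about the first nuance: values just before and after a gap are still scored, even though they may be the least reliable part of the separation.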