rchlumsk / RavenR

R package for handling Raven hydrologic modelling framework inputs, outputs, and diagnostics
36 stars 16 forks source link

Monthly bias plot fix #103

Closed avelascoa closed 2 years ago

avelascoa commented 2 years ago

In the equation below, observed runoff is summed monthly even when there are days within a specific month with many NAN values. Besides, if all values within a month are NAN, then the result is 0 which results in "diff" with Inf values when using the flag normalize =TRUE due to the division by 0 and therefore no plot. obs.monthly <- apply.monthly(obs, sum, na.rm = TRUE)

The simulated runoff is also summed monthly, but there is no gaps in the daily data thus subtracting monthly simulated runoff from monthly observed runoff will always tend to be overestimated if months missing some days of observed data are included in the analysis. My suggestion is to include a flag to exclude the summation of observed months with incomplete data and printing a couple of lines to quickly check how many were included/excluded fromthe analysis.

rchlumsk commented 2 years ago

Hi Arturo, thanks for taking the time to make a commit to the package. I like your suggestion and agree with it in principle. Once I get a chance to review this and test it out, I will get back to you and merge in the change. Thanks so much

scantle commented 2 years ago

I haven't looked at the code, but in a recent project I inherited they had a "max_missing_days" argument which, in the summation to monthly, would make a monthly sum NA if it was missing more than that number of days. A flag works, or even keeping a column with the number of days used.

rchlumsk commented 2 years ago

Thanks @avelascoa for the contribution, and @scantle for the additional thoughts. A max_missing_days argument, and perhaps something that removes missing days first from both sim and obs before the monthly calculation, could be options. For now I have kept the flag as is to just include or exclude months with any missing days as a first step, but those would be improvements for sure.

Note that there was one typo with an "imcomplete_month" line that I fixed. I also changed the print statements to message() instead, and them conditional on the parameter incomplete_month being FALSE, otherwise all months are used and the statements are less informative.