DOI-USGS / loadflex

Models and Tools for Watershed Flux Estimates
http://dx.doi.org/10.1890/ES14-00517.1

expand predictSolute to predict by time period #199

Open aappling-usgs opened 7 years ago

aappling-usgs commented 7 years ago

See also #174, which we resolved for batch mode but would like to correct more systematically throughout the package.

Predictions need to happen across the full time period(s) of interest because they need to account for correlation among the errors in estimates that arises from parameter uncertainty.
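To make the point about correlated errors concrete, here is a minimal sketch of how the standard error of a period total depends on the correlation among the unit-level errors. It is plain Python rather than loadflex code, and the function name `se_of_total`, the single shared correlation `rho`, and the numbers are all assumptions chosen for illustration; real models have a full covariance structure rather than one common correlation.

```python
import math

def se_of_total(unit_ses, rho):
    """Standard error of a sum of unit predictions.

    Assumes (for illustration only) that every pair of unit errors
    shares the same correlation `rho`, so that
    Var(sum) = sum_i sum_j s_i * s_j * corr(i, j).
    """
    var = 0.0
    for i, si in enumerate(unit_ses):
        for j, sj in enumerate(unit_ses):
            var += si * sj * (1.0 if i == j else rho)
    return math.sqrt(var)

# Hypothetical month of 30 daily predictions, each with SE = 2.0
daily_ses = [2.0] * 30
independent = se_of_total(daily_ses, 0.0)  # errors treated as independent
correlated = se_of_total(daily_ses, 0.8)   # errors strongly correlated
```

With these made-up numbers the correlated case gives a much larger standard error for the monthly total than the independent case, which is why unit predictions and their individual uncertainties are not enough to build an interval for an aggregate.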

aappling-usgs commented 7 years ago

Alternatively, we could rewrite aggregateSolute to accept a model rather than predictions, and to include confidence intervals only when it can compute them well: either by wrapping rloadest functionality, or by using approaches for interpolation/composite models that better embrace autocorrelation of errors.

aappling-usgs commented 7 years ago

Basic challenge: to produce a monthly or annual estimate, we need more information than just the predictions. But the current structure of the package separates instantaneous/unit predictions from aggregation in such a way that the information isn't available when we need it.

This mismatch exists because I didn't understand the uncertainty propagation problem well enough 2 years ago. @wdwatkins, you and I have explored aspects of this problem since then. It's a different problem for each model type. I've backlogged the GitHub issues for fixing it for composite, interpolation, and lm() models, but it's immediately fixable for loadReg2 models. This issue is about restructuring a bit so that our fix for loadReg2 also paves the way for eventual fixes for the other model types.

I've proposed two possible solutions above but am leaning toward the first, which is consistent with the title of this issue: let's modify predictSolute to accept an argument that specifies the temporal resolution of interest and to return predictions at that resolution.
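A rough sketch of what "predict at a requested resolution" could look like, written in Python purely for illustration. Nothing here is the loadflex API: `period_key`, `predict_by_period`, the `by` values, and the `toy_total` stand-in are hypothetical names. The idea is only that the prediction function groups the timestamps by period and hands each whole group to a model-specific routine, which can then account for error correlation within the period.

```python
from datetime import date, timedelta

def period_key(d, by):
    """Label the period that date `d` belongs to."""
    if by == "unit":
        return d
    if by == "month":
        return (d.year, d.month)
    if by == "calendar year":
        return d.year
    if by == "water year":
        # USGS water years run 1 Oct to 30 Sep and are named
        # for the calendar year in which they end.
        return d.year + 1 if d.month >= 10 else d.year
    raise ValueError(f"unknown period: {by!r}")

def predict_by_period(dates, predict_total, by="unit"):
    """Group dates by period, then predict once per group.

    `predict_total` is a model-specific callback (hypothetical) that
    takes all dates in one period and returns (estimate, std_error).
    """
    groups = {}
    for d in dates:
        groups.setdefault(period_key(d, by), []).append(d)
    return {key: predict_total(members) for key, members in groups.items()}

def toy_total(members):
    # Placeholder for a real model; the numbers are meaningless.
    return (10.0 * len(members), 2.0 * len(members))

# Four days straddling the end of water year 2016
dates = [date(2016, 9, 29) + timedelta(days=i) for i in range(4)]
by_water_year = predict_by_period(dates, toy_total, by="water year")
```

Passing the callback the full set of dates for each period, rather than pre-computed unit predictions, is the design point: each model type can then supply whatever uncertainty calculation it is able to do well.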

wdwatkins commented 7 years ago

So it seems the mean water year and mean calendar year options will need to be eliminated, since those can't go into rloadest::predLoad? Or can we still incorporate those two options afterwards?

aappling-usgs commented 7 years ago

Hmm, yep, those are harder. For loadflexBatch we restricted the data to complete years and then used predLoad('total') - do you agree that approach is about as rigorous as we could hope for? The loadflexBatch code is at https://github.com/USGS-R/loadflexBatch/blob/master/batchHelperFunctions.R#L268. If it sounds like a good long-term solution to you, then the next question is how hard it would be to add that logic to predictSolute.loadReg2 - what do you think?
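For readers unfamiliar with the "restrict to complete years" step, here is a minimal Python sketch of the filtering idea only. It does not reproduce the linked loadflexBatch code, and it assumes that "complete" means every calendar day of the year is present in a daily record; the function names are hypothetical.

```python
import calendar
from collections import Counter
from datetime import date, timedelta

def complete_years(dates):
    """Return the calendar years for which every day appears in `dates`."""
    days_seen = Counter(d.year for d in set(dates))  # set() drops duplicates
    return sorted(
        year for year, n in days_seen.items()
        if n == (366 if calendar.isleap(year) else 365)
    )

def restrict_to_complete_years(dates):
    """Drop all dates that fall in a partially covered calendar year."""
    keep = set(complete_years(dates))
    return [d for d in dates if d.year in keep]

# Hypothetical daily record from 2015-07-01 through 2017-03-31
start = date(2015, 7, 1)
record = [start + timedelta(days=i) for i in range(640)]
kept = restrict_to_complete_years(record)
```

In this example only 2016 is fully covered, so only its dates would be passed on to the total-load prediction; the partial years at either end are dropped before averaging.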

wdwatkins commented 7 years ago

Mm yeah, I forgot that accomplishes the same thing. That should be doable; we might be able to just pull that code into a loadflex function so it stays in one place.