Canonical Method for FDC to Probability Computation

ghost commented 6 years ago

Given a decade of flows in variable x, discharges of interest q (which could be x itself), an offset to deal with zeros, and the generalization that an FDC is roughly log-normal in shape, below is my line of thinking on linear interpolation and extrapolation to the probabilities.


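fdc2p_core <- function(x, q, offset=0.01) {
   n  <- length(x)
   pp <- seq_len(n)/(n+1)   # Weibull plotting positions (nonexceedance)
   # Interpolate linearly in log10(flow) versus standard normal variate space.
   # rule=2 holds the end value for any q outside the range of x;
   # ties=mean is the approx() default, stated explicitly for tied flows.
   pnorm(approx(log10(sort(x)+offset), qnorm(pp),
           xout=log10(     q +offset), rule=2, ties=mean)$y)
}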

I would be greatly interested in the team's advice on how the problem should be framed. Lastly, the algorithm for going from probability back to flow is a little more delicate in the right tail because it needs real extrapolation: flow is not bounded to the open interval (0,1) as probability is.
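A minimal sketch of the probability-to-flow direction, to illustrate why the tails need care. The function name p2fdc_sketch is hypothetical, and the choice to extend each tail along the straight line through the two end points in log10(flow) versus qnorm space is an assumption made here for illustration, not a method settled in this thread.

```r
# Hypothetical inverse of fdc2p_core(): nonexceedance probability p back to flow.
p2fdc_sketch <- function(x, p, offset = 0.01) {
  n  <- length(x)                        # needs at least two flows
  lx <- log10(sort(x) + offset)          # sorted flows in log10 space
  z  <- qnorm(seq_len(n) / (n + 1))      # Weibull plotting positions as normal variates
  zo <- qnorm(p)
  # Interpolate inside the observed range (rule = 1 gives NA outside it).
  ly <- approx(z, lx, xout = zo, rule = 1)$y
  # Assumption: extend each tail along the line through its two end points.
  lo <- which(zo < z[1])
  hi <- which(zo > z[n])
  ly[lo] <- lx[1] + (zo[lo] - z[1]) * (lx[2] - lx[1]) / (z[2] - z[1])
  ly[hi] <- lx[n] + (zo[hi] - z[n]) * (lx[n] - lx[n - 1]) / (z[n] - z[n - 1])
  pmax(10^ly - offset, 0)                # undo the offset; flow cannot be negative
}

x <- c(0, 1, 10, 100, 1000)              # hypothetical flows, including a zero
p2fdc_sketch(x, c(0.001, 0.5, 0.999))
```

Probabilities beyond the largest plotting position map to flows above max(x), which is the "real extrapolation" the right tail requires; the left tail is floored at zero after the offset is removed.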

scworland commented 6 years ago

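I have been using a loess model:

# Smooth flow (q) as a function of probability (f) on the estimated FDC
fit <- loess(q ~ f, data = est_fdc, span = 0.2)

# Predict flow at the donor probabilities (ep) and round to whole units
Q_est <- data.frame(date  = donor_ep$date,
                    Q_est = round(predict(fit, donor_ep$ep), 0))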

Do you see a problem with that approach?

ghost commented 6 years ago

It is ambiguous on the handling of zeros, and it does not rely on an interpolation scheme that leans towards general linearity in log-qnorm space. The question of span could be resolved, and I suspect that 0.2 is reasonable enough. loess itself does not guarantee monotonicity, though sorting of the data does.
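The monotonicity point can be checked directly. The sketch below uses a synthetic stand-in for est_fdc (hypothetical data in which q rises with a nonexceedance probability f) and tests whether the loess predictions ever decrease. Sorting the predictions is shown as one simple repair; that repair is an assumption for illustration, not something either commenter proposed in this exact form.

```r
# Synthetic stand-in for est_fdc (hypothetical): f = probability, q = flow.
set.seed(42)
f <- seq(0.01, 0.99, by = 0.01)
est_fdc <- data.frame(
  f = f,
  q = 10^(2 + 0.8 * qnorm(f) + rnorm(length(f), sd = 0.05))
)

fit  <- loess(q ~ f, data = est_fdc, span = 0.2)
qhat <- predict(fit, data.frame(f = f))

# TRUE if the smoothed curve decreases anywhere as f increases.
any(diff(qhat) < 0)

# One simple repair: sort the predictions so they are nondecreasing in f.
qmono <- sort(qhat)
```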

I was uncertain as to your thinking. I have often seen thinking not break towards log, and I want to understand why. I also notice that your data is est_fdc, but my thinking was along the lines of the Qp part, where there is an obs_fdc. As in: Q[donor] --> P[obs via obs DV donor] == P[ungaged] --> Q[ungaged via est_fdc].
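The chain above can be sketched end to end. The helper names q2p and p2q, and both data vectors, are hypothetical. Both helpers interpolate in log10(flow) versus qnorm(probability) space with rule=2, so this sketch holds the end values rather than extrapolating the tails.

```r
# Flow -> nonexceedance probability, read from an FDC given as a vector of flows.
q2p <- function(fdc_q, q, offset = 0.01) {
  n <- length(fdc_q)
  pnorm(approx(log10(sort(fdc_q) + offset), qnorm(seq_len(n) / (n + 1)),
               xout = log10(q + offset), rule = 2, ties = mean)$y)
}
# Nonexceedance probability -> flow, read from an FDC given as a vector of flows.
p2q <- function(fdc_q, p, offset = 0.01) {
  n <- length(fdc_q)
  10^approx(qnorm(seq_len(n) / (n + 1)), log10(sort(fdc_q) + offset),
            xout = qnorm(p), rule = 2)$y - offset
}

donor_obs   <- c(3, 8, 1, 20, 5, 0, 12)    # hypothetical observed donor daily values
ungaged_fdc <- c(0.5, 2, 4, 6, 9, 15, 30)  # hypothetical estimated FDC at the ungaged site

p_donor   <- q2p(donor_obs, donor_obs)     # Q[donor] --> P[obs]
q_ungaged <- p2q(ungaged_fdc, p_donor)     # P[ungaged] taken equal to P[obs] --> Q[ungaged]
```

Because both vectors have the same length here, each donor day lands on the ungaged flow of the same rank; with FDCs of different lengths the interpolation does the matching.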

Perhaps you are thinking that I am inquiring about the "pQ" part, but I wanted to start this important thread with "Qp."

Consider: if we are feeding in an FDC that is nearly a decade long anyway, and then turning around and estimating the probability for each of those points, then one solution is as simple as lmomco::pp(x, sort=FALSE). But if a donor gage's decade within x is defined as having some 60 missing days, and we estimate those missing DVs by interpolation, then inversion would require an FDC having fewer than 3653 days, so an algorithm is needed.
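A small sketch contrasting the two cases. The helper pp_unsorted is hypothetical and written in base R; it is assumed to give what lmomco::pp(x, sort=FALSE) returns with its default Weibull setting, and that equivalence should be checked against the lmomco documentation. The data values are made up.

```r
# Weibull plotting position for each value, kept in the original order of x.
pp_unsorted <- function(x) rank(x, ties.method = "first") / (length(x) + 1)

x <- c(12, 0, 5, 40, 5, 7)   # hypothetical daily values, all observed
p_all <- pp_unsorted(x)      # case 1: every day is itself a point on the FDC

# Case 2: one day was missing and later infilled, so it is not part of the FDC.
obs  <- c(12, 0, 5, 40, 7)   # days that were actually observed
fill <- 9                    # hypothetical infilled value for the missing day
n    <- length(obs)
p_fill <- pnorm(approx(log10(sort(obs) + 0.01), qnorm(seq_len(n) / (n + 1)),
                       xout = log10(fill + 0.01), rule = 2, ties = mean)$y)
```

In the first case the probabilities follow from ranks alone; in the second the infilled value falls between two observed flows, so its probability has to be interpolated from the shorter observed FDC.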

scworland / restore-2018
