derrickturk / aRpsDCA

R package for Arps decline curve analysis.
GNU Lesser General Public License v2.1

best.fit function and number of equation parameters #6

Open brzegorz opened 6 years ago

brzegorz commented 6 years ago

Hello,

As far as I know, when you fit a model to data, the model with more parameters tends to have a better SSE regardless of its actual viability. In the best.fit function, you compare the SSE of the exponential model with 2 parameters (D and qi), the hyperbolic model (Di, qi, b) and the hyp2exp model (Di, qi, b and Df). Doesn't that skew the function towards fitting the hyp2exp model, especially at the cost of the exponential model? Wouldn't another criterion, such as the Akaike information criterion or adjusted R^2, be better for choosing which model to return as "best"?
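To make the concern concrete, here is a minimal sketch in Python (not aRpsDCA code) of how an AIC-based comparison can disagree with a raw SSE comparison. It assumes independent Gaussian errors, for which the AIC of a least-squares fit is n * ln(SSE / n) + 2k up to an additive constant. The SSE values, the number of observations, and the helper name `aic_from_sse` are all made up for illustration.

```python
import math

def aic_from_sse(sse, n_obs, n_params):
    """AIC (up to an additive constant) for a least-squares fit,
    assuming independent Gaussian errors."""
    return n_obs * math.log(sse / n_obs) + 2 * n_params

# Hypothetical fit results, invented for illustration only.
# k is the number of fitted parameters for each model.
fits = {
    "exponential": {"sse": 1200.0, "k": 2},  # qi, D
    "hyperbolic":  {"sse": 1180.0, "k": 3},  # qi, Di, b
    "hyp2exp":     {"sse": 1175.0, "k": 4},  # qi, Di, b, Df
}
n_obs = 36  # hypothetical: three years of monthly rates

# Selection by raw SSE: the model with the most parameters wins.
best_by_sse = min(fits, key=lambda name: fits[name]["sse"])

# Selection by AIC: each extra parameter has to pay for itself.
aic = {name: aic_from_sse(f["sse"], n_obs, f["k"]) for name, f in fits.items()}
best_by_aic = min(aic, key=aic.get)

print("best by SSE:", best_by_sse)
print("best by AIC:", best_by_aic)
```

With these invented numbers, the small SSE improvements from the extra parameters do not offset the 2k penalty, so the two criteria pick different models.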

Thanks, Brzegorz

derrickturk commented 3 years ago

This is a good question, and the entire functionality of best.fit is poorly thought out. In addition to the issue you correctly point out with the naive model comparison done by best.fit, squared-error-based cost functions tend to work poorly in practice for the usual applications of decline-curve analysis. They put a high premium on matching the high-rate production at early time (which has already happened, and thus isn't terribly useful to forecast for an economic evaluation) at the expense of fitting low-rate production at late time and into the future (which is the big "unknown" in an evaluation).
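A rough illustration of the early-time weighting, as a Python sketch rather than anything taken from aRpsDCA: take a hypothetical exponential decline and a forecast that is off by the same 10% at every time step, then look at how much of the total squared error comes from the first year. The decline parameters and the uniform 10% miss are assumptions chosen for illustration. The log-residual comparison is included only as a contrast, not as the fix for best.fit.

```python
import math

# Hypothetical exponential decline: q(t) = qi * exp(-D * t)
qi = 1000.0   # initial rate (assumed)
D = 0.1       # nominal decline per month (assumed)
months = range(36)

actual = [qi * math.exp(-D * t) for t in months]
# A forecast that is 10% too high at every time step (assumed).
forecast = [1.1 * q for q in actual]

# Squared-error contribution of each month.
sq_err = [(f - a) ** 2 for f, a in zip(forecast, actual)]
early_share_sse = sum(sq_err[:12]) / sum(sq_err)

# Same comparison on log residuals, which measure relative error.
log_sq_err = [(math.log(f) - math.log(a)) ** 2 for f, a in zip(forecast, actual)]
early_share_log = sum(log_sq_err[:12]) / sum(log_sq_err)

print(f"first-year share of SSE:             {early_share_sse:.3f}")
print(f"first-year share of log-residual SS: {early_share_log:.3f}")
```

Under these assumptions the first 12 of 36 months account for roughly 90% of the SSE even though the relative miss is identical everywhere, while on the log scale each month counts equally.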

Unfortunately, I don't have time to address this in the near- or mid-term future, so I'm going to have to tag this issue as "enhancement" and put it into deep freeze until I have time to implement a better approach - or someone else steps up with an idea.

(Sorry for taking 3 years to respond to this issue!)