Closed erikbern closed 4 years ago
I agree, the original was a misname.
This makes it more in line with statsmodels Model.predict()
https://github.com/statsmodels/statsmodels/blob/master/statsmodels/regression/linear_model.py#L354
I'm actually fine with the name cdf
, since it made sense statistically, but I get why it could be confusing, so not against this change.
My more opinionated thought -- the predict/cdf
function should output a list of samples for the case when MCMC is used, instead of defaulting to the CI. I think more weight should be on the end-user to make their own CI is they need, plus having the samples is use for posterior comparison between cohorts. I think @mbratten implemented a update here, but IMO it makes sense to incorporate it into the overall predict
function
Not opposed to the change but one thing I liked about the name cdf
was the fact that it implied that you could have the model return the functions entire range if you just supplied a vector t
rather than a singular value. This actually helped me understand what was going on a lot more since initially when I read through some of the documentation my thought was if there was a predict
function it would return the value C
as it was sort of the logical thing the model was attempting to capture.
My more opinionated thought -- the
predict/cdf
function should output a list of samples for the case when MCMC is used, instead of defaulting to the CI.
This functionality would be useful. I'm currently trying to get confidence intervals for difference in predictions and I have to redo the cdf function to get them. Would be nice to have this option!
This functionality would be useful. I'm currently trying to get confidence intervals for difference in predictions and I have to redo the cdf function to get them. Would be nice to have this option!
I think this is what @mbratten did in https://github.com/better/convoys/pull/123
When I think about it, I'm not sure if that should be called predict_posterior
though. Maybe I'll rename it posterior_samples
or similar
I'm tempted to remove the ci
flag on the predict
method and make it a separate (third) method besides predict
and predict_posterior
. What do you think?
Maybe I'll do that in a separate PR
From docs after this:
The
cdf
name was an attempt to make it more similar to the probability distributions inscipy.stats
but in retrospect that was confusing since the convoys models do not represent probability distributions – they are more like regression models that output something between 0 and 1.This deprecates the old method names but keeps them for the time being.
EDIT: also splits the prediction method into
predict
vspredict_ci
along the lines of https://martinfowler.com/bliki/FlagArgument.html