Open maximilianpfau opened 4 years ago
Thanks for the question and the clear demonstration of the issue!
I looked into this and it turns out this is actually the intended behavior! The problem is that you are accessing the parameter values directly from the predicted distributions instead of from the provided dictionary. In other words, if you want the scale parameter s
for the first 10 observations you should do Y_dists[0:10].params['s']
, not Y_dists[1:10].s
. The class properties that encode the parameters are internal to those classes and should not be accessed by the user.
This is clearly a little confusing, so I do think this points towards doing one or two things:
sigma
, mu
parametrization of the lognormal, instead of the s = sigma
, scale = exp(mu)
parametrization that scipy.stats uses. This seems to be more commonly understood? @ryan-wolbeck @avati @tonyduan Thoughts?
@alejandroschuler I really like the idea of doing your first suggestion, it would then follow PEP 8. I don't really have an opinion on the second piece.
Thank you for the help!
@ryan-wolbeck are you interested in taking this on?
@alejandroschuler yeah I can do this, not sure when I'll get to it but I'll put it on the backlog
Thank you @ryan-wolbeck!
I also think that there is right now some confusion ... in particular, to clarifiy maybe what is was said by @maximilianpfau
.params['scale']
!= .scale
,
which is indeed a bit annoying (one has to figure that out by printing them )
Other than that, I also tested that
.dist.median
( = exp(.loc)
).dist.mean
( = exp( .loc + .params['s'] **2 / 2
)(similarly for the variance)
are indeed correct
Dear @alejandroschuler !
Thank you for the ngboost and you time earlier today.
Unfortunately, the output of regression with a LogNormal distribution appears to be incorrect.
https://colab.research.google.com/drive/1qr1l6h0NrD5PGYfF-FtiTnojb_FKEFAZ?usp=sharing
Specifically: Error 1: Retrival of the shape parameter "s" does not produce any output (cf. below) Error 2: Retrival of the scale parameter "scale" produces the wrong values (i.e., the "s" value from Y_dists.params) Error 3: The expression "np.exp(Y_dists.loc)" produces the "scale" values from Y_dists.params
Thank you for your input in advance, many thanks, Maximilian