Closed pnkraemer closed 4 years ago
It seems dangerous to me that the property and the setter have the same name
Why not return the empty dict here? I think there are settings in which you generally need the parameters of a distribution and hence use the parameters
property. If there are none, Raising an error might be painful and might inhibit some algorithms from terminating. Thoughts?
Agree that this is a valid to-do. IMO it should be done at the __init__
stage.
I personally am a big fan of not using maths-variables in code, i.e. no f
, no x
, no sigma
, etc.. In my opinion that increases readability significantly. On top of this, it is part of Pep8 (which I am fine with breaking occasionally, as it is more of a guide than a law).
Suggestion to replace x
here and in related functions: loc
, pt
, point
, samp
, sample
(the "sample" input to PDFs is part of the description at the wikipedia page of PDFs), event
. Thoughts?
I know that avoiding maths in scientific computing is tedious, but I have found it to be worthwile.
Replace hasattr
with is None
(I think the current version is an outdated, older version).
I find it more logical to read if
... elif
... else
or at least if
... elif
.
The same applies in the lines below in the code
If self._cdf
is not defined this would lead to an error.
About the previous one again: is "median" something that can be part of parameters
?
Is parameters
even required? It would be less type-checking and more concise if there was no parameters
at all, and if a distribution depends on parameters, this would be part of the distribution.
For instance:
class Normal(Distribution):
def __init__(self, mean, covar):
self._mean = mean
self._covar = covar
Distribution.__init__(self, _gausspdf, _gausslogpdf, _gausscdf, _gausslogcdf))
or something along these lines.
Is Distribution
intended to be subclassed or intended to be initialised with functions? The current implementation suggests the latter whereas the former is part of the docstring.
These kind of convenience methods just blow up classes, IMO. I think whoever is able to call covar
is able to turn a variance into a standard deviation. Also, here one has to be careful with multivariate distributions (-> Cholesky) and univariate distributions (-> sqrt). For this reason, I would suggest to take out std
.
If implementation of these dunder methods is supposed to be handled by subclasses, I would only write raise NotImplementedError
here. That makes it clear that the Distribution
class does not support addition, etc. by default.
IMO, this makes sense to be extracted into a function _scpstats_to_dist(...)
or something along these lines.
One thing that might be useful to think about is whether we restrict Diracs, Gaussians, etc. to be supported on (d,) arrays and make scalars etc. raise errors. Everybody can transform a scalar into an array, but it is more convenient to work with numpy arrays only (we can then use operations like @ and others).
Why set
parameters=None
as a default and notparameters={}
if this is transformed anyway?
Having a mutable object like dict
as an argument is an anti-pattern.
If implementation of these dunder methods is supposed to be handled by subclasses, I would only write
raise NotImplementedError
here. That makes it clear that theDistribution
class does not support addition, etc. by default.
Good idea. This will help in resolving #33.
It seems dangerous to me that the property and the setter have the same name
This is proper python 3 syntax. See https://docs.python.org/3/library/functions.html#property.
Hi, sorry to tune in like this, I just looked in this issue by accident.
If implementation of these dunder methods is supposed to be handled by subclasses, I would only write raise NotImplementedError here. That makes it clear that the Distribution class does not support addition, etc. by default.
The proper way to defer handling to other dunders is to return NotImplemented
, see https://docs.python.org/3/reference/datamodel.html#the-standard-type-hierarchy. (Also sorry if I read this out of context, I didn't read through the thread)
Thanks @JonathanWenger and @coldfix, I didnt know these things :)
Why not return the empty dict here? I think there are settings in which you generally need the parameters of a distribution and hence use the
parameters
property. If there are none, Raising an error might be painful and might inhibit some algorithms from terminating. Thoughts?
self._parameters
is initialized to an empty dict, so if it is None
something went wrong. Changed it to an AttributeError
.
I personally am a big fan of not using maths-variables in code, i.e. no
f
, nox
, nosigma
, etc.. In my opinion that increases readability significantly. On top of this, it is part of Pep8 (which I am fine with breaking occasionally, as it is more of a guide than a law).Suggestion to replace
x
here and in related functions:loc
,pt
,point
,samp
,sample
(the "sample" input to PDFs is part of the description at the wikipedia page of PDFs),event
. Thoughts?I know that avoiding maths in scientific computing is tedious, but I have found it to be worthwile.
I agree with you with the maths notation. However for things like pdf
I tried to follow scipy
convention. I am also not sure what term to use here, e.g. pdf(quantile)
or pdf(sample)
seem way more confusing to me.
If
self._cdf
is not defined this would lead to an error.
I would say this is desired fall-back behaviour. If the median is otherwise known, subclasses of distributions can override this method.
These kind of convenience methods just blow up classes, IMO. I think whoever is able to call
covar
is able to turn a variance into a standard deviation. Also, here one has to be careful with multivariate distributions (-> Cholesky) and univariate distributions (-> sqrt). For this reason, I would suggest to take outstd
.
I think convenience methods are good for inexperienced users. These will always raise an error unless var
is implemented by the sub-class and thus only available for 1d distributions.
IMO, this makes sense to be extracted into a function
_scpstats_to_dist(...)
or something along these lines.
✅
Issues needing no discussion are fixed in 151fa36.
https://github.com/probabilistic-numerics/probnum/blob/9a4e6cf4aee888ad447fe4228e576c7284aa90dd/src/probnum/probability/distribution.py#L68-L69
Why set
parameters=None
as a default and notparameters={}
if this is transformed anyway?