Open aleksandra-kim opened 6 years ago
Original comment by Chris Mutel (Bitbucket: cmutel, GitHub: cmutel).
If you actually use the scipy distributions, you do get an warning (though it seems like the result may still be correct):
#!python
In [2]: stats.triang(c=0)
Out[2]: <scipy.stats._distn_infrastructure.rv_frozen at 0x10f06d320>
In [3]: stats.triang(c=0).pdf(1)
/Users/cmutel/miniconda3/lib/python3.6/site-packages/scipy/stats/_continuous_distns.py:5243: RuntimeWarning: divide by zero encountered in true_divide
return np.where(x < c, 2*x/c, 2*(1-x)/(1-c))
Out[3]: 0.0
Coudl you please check and make sure that the returned values for pdf
, ppf
, and cdf
are still correct?
In the line ((params['loc'] >= params['maximum']).sum()
, ((params['loc'] >= params['maximum'])
will be an array of booleans. When we sum
this array, we get the number of times this condition was true. We then test the "truthiness" of this sum (0 is false, everything else is true) to find out if there were any cases where a value was outside the defined bounds.
Original comment by Pascal Lesage (Bitbucket: MPa, ).
I have checked both situations with mode==min (c=0) and mode==max (c=1) for extreme values (0, 1) and an intermediate value (0.5).
Scipy complains often but nevertheless returns returns the correct value except in one case: the pdf(1) of the case where mode==max gives nan
, should give 2.
#!python
In [1]: from scipy.stats import triang
In [2]: test_values = [0, 0.5, 1]
In [3]: triang_c0 = triang(c=0)
In [4]: for v in test_values:
...: print("pdf with {}: {}".format(v, triang_c0.pdf(v)))
...:
C:\mypy\anaconda\lib\site-packages\scipy\stats\_continuous_distns.py:4742: RuntimeWarning: invalid value encountered in true_divide
return np.where(x < c, 2*x/c, 2*(1-x)/(1-c))
pdf with 0: 2.0
C:\mypy\anaconda\lib\site-packages\scipy\stats\_continuous_distns.py:4742: RuntimeWarning: divide by zero encountered in true_divide
return np.where(x < c, 2*x/c, 2*(1-x)/(1-c))
pdf with 0.5: 1.0
pdf with 1: 0.0
In [5]: for v in test_values:
...: print("ppf with {}: {}".format(v, triang_c0.ppf(v)))
...:
ppf with 0: 0.0
ppf with 0.5: 0.2928932188134524
ppf with 1: 1.0
In [6]: for v in test_values:
...: print("cdf with {}: {}".format(v, triang_c0.cdf(v)))
...:
cdf with 0: 0.0
C:\mypy\anaconda\lib\site-packages\scipy\stats\_continuous_distns.py:4745: RuntimeWarning: divide by zero encountered in true_divide
return np.where(x < c, x*x/c, (x*x-2*x+c)/(c-1))
cdf with 0.5: 0.75
cdf with 1: 1.0
In [7]: triang_c1 = triang(c=1)
In [8]: for v in test_values:
...: print("pdf with {}: {}".format(v, triang_c1.pdf(v)))
...:
C:\mypy\anaconda\lib\site-packages\scipy\stats\_continuous_distns.py:4742: RuntimeWarning: divide by zero encountered in true_divide
return np.where(x < c, 2*x/c, 2*(1-x)/(1-c))
pdf with 0: 0.0
pdf with 0.5: 1.0
C:\mypy\anaconda\lib\site-packages\scipy\stats\_continuous_distns.py:4742: RuntimeWarning: invalid value encountered in true_divide
return np.where(x < c, 2*x/c, 2*(1-x)/(1-c))
pdf with 1: nan
In [9]: for v in test_values:
...: print("pdf with {}: {}".format(v, triang_c1.ppf(v)))
...:
pdf with 0: 0.0
pdf with 0.5: 0.7071067811865476
pdf with 1: 1.0
In [10]: for v in test_values:
...: print("pdf with {}: {}".format(v, triang_c1.cdf(v)))
...:
pdf with 0: 0.0
C:\mypy\anaconda\lib\site-packages\scipy\stats\_continuous_distns.py:4745: RuntimeWarning: divide by zero encountered in true_divide
return np.where(x < c, x*x/c, (x*x-2*x+c)/(c-1))
pdf with 0.5: 0.25
pdf with 1: 1.0
Original comment by Pascal Lesage (Bitbucket: MPa, ).
This is actually all fixed now with the new update to SciPy (weird how we find this now!)
The code in the error reported above: return np.where(x < c, 2*x/c, 2*(1-x)/(1-c))
is no longer in SciPy and has been replaced with return np.where(q < c, np.sqrt(c * q), 1-np.sqrt((1-c) * (1-q)))
I don't know if this was there before, but they also now explicitly address the two border conditions here
So the solution would be:
1) Have the later version of update in SciPy (1.1.0) in the requirements for stats_arrays
2) Change <=
and >=
simple inequalities in the stats_arrays code mentioned in the original issue report.
I'll do this now.
Original comment by Pascal Lesage (Bitbucket: MPa, ).
This fixes it: https://bitbucket.org/cmutel/stats_arrays/pull-requests/2/allow-mode-min-or-max-bounds-for/diff
Original report by Pascal Lesage (Bitbucket: MPa, ).
The validation of TriangularUncertainty will raise an
ImproperBoundsError
if the minimum or the maximum are equal to the mode.However, this should not be. In scipy (the underlying package for the
TriangularUncertainty
class), it is ok to pass a shape parameterc
= 0:returns array([0.58796552, 0.37544352, 0.13834934])
or a shape parameter
c
= 1:returns array([0.90142956, 0.30985289, 0.82816732])
numpy is also able to have a
left
or aright
parameter equal to themode
parameter (not shown here, trust me).The solution may just be to remove the equality sign in the assertion:
if ((params['loc'] > params['maximum']).sum() or (params['loc'] < params['minimum']).sum()):
rather thanif ((params['loc'] >= params['maximum']).sum() or (params['loc'] <= params['minimum']).sum()):
but the.sum()
is throwing me off...