sympy / sympy

A computer algebra system written in pure Python
https://sympy.org/
Other
12.84k stars 4.4k forks source link

Can't fully integrate simple normal distribution, depending on mean and deviation constants. #21620

Open allComputableThings opened 3 years ago

allComputableThings commented 3 years ago

So... I can sympy.integrate a normal distribution with mean and standard deviation:

    (10.1, 0.333333333),  # Works fine

but not:

    (8.655555555555557, 0.5212875796916135), # Fails

Kinds feels like that shouldn't make a big difference. What's up with that?

Full example:

import numpy as np
%matplotlib inline
import sympy
from sympy import symbols
from sympy import plot

def normal(x, mean, sigma):
    z = (x - mean) / sigma
    return (1 / (sigma * sympy.sqrt(2 * sympy.pi))) * sympy.exp(-(z * z) / 2)

for μ,σ in [
    (10.1, 0.333333333),  # Works fine
    (8.655555555555557, 0.5212875796916135), # Fails
]:
    x = symbols("x")

    print(f"μ={μ}, σ={σ}")
    distrib = normal(x=x,mean=μ,sigma=σ)
    # distrib = sympy.simplify(distrib) # Doesn't help
    distrib_cum = sympy.integrate(distrib, x)
    # distrib_cum = sympy.simplify(distrib_cum) # Doesn't help
    print(distrib_cum)
    plot(distrib_cum)
NameError                                 Traceback (most recent call last)

<ipython-input-18-e85bdeabeaa6> in <module>
     22     # distrib_cum = sympy.simplify(distrib_cum) # Doesn't help
     23     print(distrib_cum)
---> 24     plot(distrib_cum)
     25 
     26 

~/venv/p37_default/lib/python3.7/site-packages/sympy/plotting/plot.py in plot(show, *args, **kwargs)
   1738     plots = Plot(*series, **kwargs)
   1739     if show:
-> 1740         plots.show()
   1741     return plots
   1742 

~/venv/p37_default/lib/python3.7/site-packages/sympy/plotting/plot.py in show(self)
    220             self._backend.close()
    221         self._backend = self.backend(self)
--> 222         self._backend.show()
    223 
    224     def save(self, path):

~/venv/p37_default/lib/python3.7/site-packages/sympy/plotting/plot.py in show(self)
   1414 
   1415     def show(self):
-> 1416         self.process_series()
   1417         #TODO after fixing https://github.com/ipython/ipython/issues/1255
   1418         # you can uncomment the next line and remove the pyplot.show() call

~/venv/p37_default/lib/python3.7/site-packages/sympy/plotting/plot.py in process_series(self)
   1411             if isinstance(self.parent, PlotGrid):
   1412                 parent = self.parent.args[i]
-> 1413             self._process_series(series, ax, parent)
   1414 
   1415     def show(self):

~/venv/p37_default/lib/python3.7/site-packages/sympy/plotting/plot.py in _process_series(self, series, ax, parent)
   1239             # Create the collections
   1240             if s.is_2Dline:
-> 1241                 collection = self.LineCollection(s.get_segments())
   1242                 ax.add_collection(collection)
   1243             elif s.is_contour:

~/venv/p37_default/lib/python3.7/site-packages/sympy/plotting/plot.py in get_segments(self)
    704                     list_segments.append([p, q])
    705 
--> 706             f_start = f(self.start)
    707             f_end = f(self.end)
    708             sample(np.array([self.start, f_start]),

~/venv/p37_default/lib/python3.7/site-packages/sympy/plotting/experimental_lambdify.py in __call__(self, args)
    173         try:
    174             #The result can be sympy.Float. Hence wrap it with complex type.
--> 175             result = complex(self.lambda_func(args))
    176             if abs(result.imag) > 1e-7 * abs(result):
    177                 return None

~/venv/p37_default/lib/python3.7/site-packages/sympy/plotting/experimental_lambdify.py in __call__(self, *args, **kwargs)
    270 
    271     def __call__(self, *args, **kwargs):
--> 272         return self.lambda_func(*args, **kwargs)
    273 
    274 

<string> in <lambda>(x0)

NameError: name 'Integral' is not defined

The first problem:

μ=10.1, σ=0.333333333
0.353553390593274*sqrt(2)*erf(2.12132034568096*x - 21.4253354913777)

works fine.

The second

μ=8.655555555555557, σ=0.5212875796916135
1.30203374788369e-60*sqrt(2)*Integral(exp(31.8522556903367*x)*exp(-1.83998909636091*x**2), x)/sqrt(pi)

Fails with 'NameError: name 'Integral' is not defined' when plotted. I'm aware I can make this 'go away' by importing sympy.Integral, but why does one get better evaluated to a simpler form that the other.

(And I'm a little suspicious about whether 1.30203374788369e-60* .... is going to end well).

Fails with the latest pip install as of today (python 3.7, sympy 1.18).

oscarbenjamin commented 3 years ago

Not sure of the underlying cause but it is to do with having floats in the integrand. This works:

nfloat(nsimplify(distrib).integrate(x))

My guess is that with floats somewhere some rounding causes factorisation to fail.

As an aside it would be better (in this particular case) to compute this integral with symbols and substitute the values in afterwards:

In [32]: integrate(exp(-((x - z)/t)**2), x)
Out[32]: 
        ⎛x - z⎞
√π⋅t⋅erf⎜─────⎟
        ⎝  t  ⎠
───────────────
       2       

In [33]: _.subs({z:1, t:3})
Out[33]: 
        ⎛x   1⎞
3⋅√π⋅erf⎜─ - ─⎟
        ⎝3   3⎠
───────────────
       2 
asmeurer commented 3 years ago

This is most likely related to the issue that SymPy does not complete the square inside the exponential. This has been brought up several times before, most recently at https://github.com/sympy/sympy/issues/21362 where some fixes are discussed. It's probably an issue with floats in particular because exp(constant) evaluates to a floating point number if constant is a float.