jonathf / chaospy

Chaospy - Toolbox for performing uncertainty quantification.
https://chaospy.readthedocs.io/
MIT License
447 stars 87 forks source link

problems with burr distribtuion #100

Closed sronen71 closed 6 years ago

sronen71 commented 6 years ago
  1. burr definition in cores.py is Burr type III (aka inverse burr or dagum), which is ok and is same form as burr in scipy.stats. However it is not Burr XII (Singh Mandalla) as claimed in the inline comment in collections.py.
  2. Trying to use burr to calculate moments, samples etc gives error coming from _bnd method and exits. Reason seems to be that the method _bnd(self) in cores.py of burr class should have the c,d parameters as arguments, but it doesn't.
  3. After fixing the _bnd method to be _bnd(self, c, d), calculation of the expectation with cp.E gives wrong large number (5e9) instead of the correct mean. E.g for cp.E(cp.Burr(3,2))
jonathf commented 6 years ago

1 and 2: Okay, something when wrong in the implementation here. The distribution is not Burr XII, but correctly as you point out Burr III, and indeed the _bnd method is incorrectly implemented. I'll update both now.

3: That is a bit harder to deal with. The function cp.E approximate the expected value when an internal function _mom (function for computing raw statistical moments) is missing. The current default is to use quadrature integration using Clenshaw-Curtis which in the case of Burr clearly works poorly.

To get around this problem we ideally need to know how what the raw statistical moments are, or if those are not possible to retrieve, an better approach to calculate the moments.

I am quite unfamiliar with the Burr family, so if you want me to provide a fix, I will need help figuring out how to calculate the moments. Is that something you perhaps could provide?

sronen71 commented 6 years ago

Burr III is also called Dagum.
For X ~ cd(x(-c-1.0))*((1+x*(-c1.0))(-d-1.0)) The moments are given e.g. in http://www.ccsenet.org/journal/index.php/ijsp/article/view/64605 eq(5) as: E[x^k] = d*B(1-k/c,d+k/c) where B is the beta function. Wiki article on Clenshaw Curtis quadrature suggests transforming a semi infinite or infinite interval into a finite one via a simple coordinate transform. Perhaps it's a feasible option to implement for the numerical calculation? https://en.wikipedia.org/wiki/Clenshaw%E2%80%93Curtis_quadrature

jonathf commented 6 years ago

I've pushed your suggested moment function and the update to the bound function and docs to the development branch. If your interested, you may try it for yourself.

I took the expected value cp.E(cp.Burr(3, 2)) and got infinity. Was your reaction to getting the 5e9 value that it was too low, or am I'm missing something?

sronen71 commented 6 years ago

There is another defect in Burr() in collections.py . dist = cores.burr(c=1., d=1.)scale + loc Should be instead: dist = cores.burr(c, d)scale + loc

The expected results is cp.E(cp.Burr(3,2))= 1.6122661. I verified it gives the correct result after fixing the above.

P.s: perhaps consider using scipy.stats to replace your implementation of distributions? It also seems to have quadrature scipy.integrate.quad. Alternatively I think it could be interesting to add some of the functionality of the rest of your package to scipy.

E.g, scipy can compute the correct numerical mean of burr of its semi infinite domain: print scipy.integrate.quad(lambda x: x*scipy.stats.burr(3,2).pdf(x),0,np.inf)[0] gives: 1.61226610154 But also faster using moments directly: scipy.stats.burr(3,2).mean()

jonathf commented 6 years ago

You are quite right. Must have been tired Yesterday not to check the interface in collection. I've pushed the changes now, and it works as expected.

If I could use the scipy.stat as backend I probably would, but there are many constructs that I use related to polynomial chaos expansion that makes an interface difficult.

Thanks for providing the moement function BTW. Using analytical moments are preferable to numerical integration.

sronen71 commented 6 years ago

Great. Thank you!