maximenc / pycop

Python library for multivariate dependence modeling with Copulas
https://pypi.org/project/pycop/
MIT License
93 stars 19 forks source link

Numerical problem for Joe copula at high theta? #10

Closed carolinamattsson closed 1 year ago

carolinamattsson commented 1 year ago

simu_archimedean("joe", n, m, theta) produces vectors with values of 1.000 when theta is brought above 6 or so. For higher values of theta this becomes a real problem because the marginals drift well away from uniform (see attached plots).

My suspicion is that the sampling procedure is encountering numerical issues, but I am not familiar enough with numerical methods to suggest a solution. What I can do is point to this blog post where the author seems to have run into a similar issue for the Joe copula, specifically, and solved it by adjusting the PDF function in some way: https://medium.com/@financialnoob/introduction-to-copulas-part-2-9de74010ed87. Hopefully it can be helpful, I'd really love to use this package in my work.

joe_theta5

joe_theta20

carolinamattsson commented 1 year ago

Well, OK, perhaps this is a general computational limit with the Joe copula? I will use the reversed Clayton instead as it seems similar. Numerical issues also arise here, but not until very very high correlations (theta>60).

maximenc commented 1 year ago

Hi @carolinamattsson, Thanks for pointing this out. Have you investigated it further?

My first spuspition is that the PDF starts to become degenerate when the theta is really high, but I would have to look at the behavior of the function at high theta.

carolinamattsson commented 1 year ago

Hi @maximenc,

I've explored what happens when I push the limits, in terms of producing 1.00s in the output vector. Using simu_archimedean("joe", 2, 10000, theta) reliably generates vectors with no 1.00s while theta<5. There are a handful of 1.00s when theta is 5, dozens of 1.00s when theta is 6/7/8, and hundreds of 1.00s when theta>8. Implementing the approach linked in my first comment was actually worse; the issue is mentioned but apparently they don't actually have a fix. You could be right that the PDF becomes degenerate... I gave up and am using a different copula.

For my purposes there's little to say whether a Joe or a (reversed) Clayton would be better, I just need to be able to push the coupling strength higher for some computational experiments. Using simu_archimedean("clayton", 2, 10000, theta) reliably generates vectors with no 0.00s while theta<65. By sampling several times I can push it to theta<80. That's definitely enough for me!

Thanks again for the useful package.

-Carolina