DanielBok / copulae

Multivariate data modelling with Copulas in Python
https://copulae.readthedocs.io/en/latest/
MIT License

Frank Copula Crash + Other Issues #29

Closed kaleb-keny closed 3 years ago

kaleb-keny commented 3 years ago

Hey @DanielBok, I was experimenting with the package and installed the latest version from conda. I have a dataset (linked) with 2 dimensions, whose marginals are mostly gamma or Pareto.

Running the below code:

import pandas as pd
from copulae import FrankCopula, MarginalCopula

distTypes = ['gamma', 'gamma']
df = pd.read_csv("C:/...")
cop = MarginalCopula(FrankCopula(dim=2), distTypes)
cop.fit(df)

This raises the traceback below. I inserted a print statement in params to see what triggers the assertion:

31.92679324563328
31.92679324563328
31.92679324563328
31.926793260633282
1028.2277099641103
1028.2277099791104
nan
Traceback (most recent call last):

  File "<ipython-input-15-5f9a10d4956d>", line 2, in <module>
    cop.fit(self.df)

  File "C:\kaleb\miniconda\envs\eth_proba\lib\site-packages\copulae\marginal\marginal.py", line 143, in fit
    self._copula.fit(data, x0, method, optim_options, ties, **kwargs)

  File "C:\kaleb\miniconda\envs\eth_proba\lib\site-packages\copulae\copula\base.py", line 112, in fit
    CopulaEstimator(self, data, x0=x0, method=method, verbose=verbose, optim_options=optim_options)

  File "C:\kaleb\miniconda\envs\eth_proba\lib\site-packages\copulae\copula\estimator\estimator.py", line 65, in __init__
    self.fit()  # fit the copula

  File "C:\kaleb\miniconda\envs\eth_proba\lib\site-packages\copulae\copula\estimator\estimator.py", line 70, in fit
    MaxLikelihoodEstimator(self.copula, self.data, self.initial_params, self.options, self.verbose).fit(m)

  File "C:\kaleb\miniconda\envs\eth_proba\lib\site-packages\copulae\copula\estimator\max_likelihood.py", line 55, in fit
    res: OptimizeResult = minimize(self.copula_log_lik, self.initial_params, **self.optim_options)

  File "C:\kaleb\miniconda\envs\eth_proba\lib\site-packages\scipy\optimize\_minimize.py", line 626, in minimize
    constraints, callback=callback, **options)

  File "C:\kaleb\miniconda\envs\eth_proba\lib\site-packages\scipy\optimize\slsqp.py", line 422, in _minimize_slsqp
    fx = sf.fun(x)

  File "C:\kaleb\miniconda\envs\eth_proba\lib\site-packages\scipy\optimize\_differentiable_functions.py", line 182, in fun
    self._update_fun()

  File "C:\kaleb\miniconda\envs\eth_proba\lib\site-packages\scipy\optimize\_differentiable_functions.py", line 166, in _update_fun
    self._update_fun_impl()

  File "C:\kaleb\miniconda\envs\eth_proba\lib\site-packages\scipy\optimize\_differentiable_functions.py", line 73, in update_fun
    self.f = fun_wrapped(self.x)

  File "C:\kaleb\miniconda\envs\eth_proba\lib\site-packages\scipy\optimize\_differentiable_functions.py", line 70, in fun_wrapped
    return fun(x, *args)

  File "C:\kaleb\miniconda\envs\eth_proba\lib\site-packages\copulae\copula\estimator\max_likelihood.py", line 87, in copula_log_lik
    return -self.copula.log_lik(self.data, to_pobs=False)

  File "C:\kaleb\miniconda\envs\eth_proba\lib\site-packages\copulae\copula\base.py", line 155, in log_lik
    return self.pdf(data, log=True).sum()

  File "C:\kaleb\miniconda\envs\eth_proba\lib\site-packages\copulae\utility\utils.py", line 36, in internal
    res = np.asarray(f(cls, x, *args, **kwargs))

  File "C:\kaleb\miniconda\envs\eth_proba\lib\site-packages\copulae\archimedean\frank.py", line 123, in pdf
    assert not np.isnan(self.params), "Copula must have parameters to calculate parameters"

AssertionError: Copula must have parameters to calculate parameters
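
The printed values show the parameter jumping from about 31.9 to about 1028 before becoming nan. One possible reason, offered here only as an assumption and not confirmed anywhere in this thread, is that a textbook evaluation of the Frank density loses all precision for very large parameters, so the optimizer receives a non-finite objective. The sketch below is standalone and does not reproduce the code in copulae; `frank_pdf_naive` is a hypothetical helper written for illustration.

```python
import math


def frank_pdf_naive(u, v, theta):
    """Textbook bivariate Frank copula density, evaluated term by term.

    Illustrative only: this is NOT the implementation used by copulae.
    """
    a = 1.0 - math.exp(-theta)
    num = theta * a * math.exp(-theta * (u + v))
    den = (a - (1.0 - math.exp(-theta * u)) * (1.0 - math.exp(-theta * v))) ** 2
    if den == 0.0:
        # The exponentials have underflowed; report nan instead of
        # letting Python raise ZeroDivisionError.
        return float("nan")
    return num / den


# Parameter values taken from the printed trace above.
moderate = frank_pdf_naive(0.5, 0.5, 31.92679324563328)
extreme = frank_pdf_naive(0.5, 0.5, 1028.2277099641103)
print(moderate, extreme)
```

At the first value the density is finite (close to theta / 4 on the diagonal), while at the second every exponential underflows to zero and the result is nan. A log-likelihood built from such values would also be nan.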

I am not sure whether the issue is the way I am running the package. Aside from that, with other copulas I've tried generating random samples and plotting them on a chart, and the results are surprising. Maybe I am using the package wrong. One example below, which uses the Clayton copula, gave results that don't seem to fit what you see empirically, although maybe Clayton should be able to capture the degree of relation between the 2 rvs.

[image]

[image]
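
One way to make the comparison between simulated and empirical dependence less visual is to compare a rank statistic such as Kendall's tau, which for a Clayton copula with parameter theta equals theta / (theta + 2). The following is a minimal standard-library sketch under that assumption; `sample_clayton` and `kendall_tau` are hypothetical helpers and are not part of copulae.

```python
import random


def sample_clayton(n, theta, seed=0):
    """Draw n pairs from a bivariate Clayton copula by conditional inversion."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        u = 1.0 - rng.random()  # uniform on (0, 1], avoids u == 0
        w = 1.0 - rng.random()
        # Invert the conditional distribution of V given U = u at level w.
        v = (u ** -theta * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)
        pairs.append((u, v))
    return pairs


def kendall_tau(pairs):
    """Plain O(n^2) Kendall's tau (no tie correction)."""
    n = len(pairs)
    concordant = discordant = 0
    for i in range(n):
        xi, yi = pairs[i]
        for j in range(i + 1, n):
            xj, yj = pairs[j]
            s = (xi - xj) * (yi - yj)
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


theta = 2.0
simulated = sample_clayton(1000, theta, seed=1)
print("sample tau:", kendall_tau(simulated))
print("theoretical tau:", theta / (theta + 2.0))
```

Applying the same `kendall_tau` to the observed data gives a number that can be set against the tau implied by the fitted parameter. A large gap would suggest the fit, rather than the plotting, is where the mismatch comes from.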

DanielBok commented 3 years ago

Ah @kaleb-keny! Sorry I missed your post! (It's a busy time at work before we wind down for Christmas)

Let me take a look at it over the weekend and get back to you on this.

kaleb-keny commented 3 years ago

Hey, I just want to thank you as well for making this package. I am really impressed by the high quality of the coding that went into it. Also, the use of numpy speeds up the calculation to a large extent compared to other copula packages in Python. As an update on this: I ended up using a GumbelCopula, which worked pretty well on the data.

DanielBok commented 3 years ago

I think the issue should be fixed with version 0.7.1.

kaleb-keny commented 3 years ago

Yeah, it ran without issue. Although 😅 I will need to do some experimentation, as I just can't seem to get the same level of linearity as the one obtained empirically. Thank you though, I appreciate that you built this package and maintain it for us Python users.