maximenc / pycop

Python library for multivariate dependence modeling with Copulas
https://pypi.org/project/pycop/
MIT License
93 stars, 19 forks

Speed up estimation up to 100x #6

Closed: ioannisrpt closed this issue 1 year ago

ioannisrpt commented 1 year ago

I see that the log-likelihood function in estimation.py is computed with Python's built-in sum over a list that evaluates the pseudo-observations one at a time. Estimation is much faster if numpy.sum is applied to a numpy.ndarray, so that the density is evaluated on the whole array in one vectorized call. Thus, if we replace

  def log_likelihood(parameters):
      params = [parameters]
      logl = -sum([ np.log(copula.get_pdf(psd_obs[0][i],psd_obs[1][i],params)) for i in range(0,len(psd_obs[0]))])
      return logl

in estimation.py with

  def log_likelihood(parameters):
      params = [parameters]
      logl = -np.sum(np.log(copula.get_pdf(psd_obs[0],psd_obs[1],params)))
      return logl

we can achieve better performance. Hope that helps everyone enjoy pycop more.
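To illustrate the difference outside of pycop, here is a minimal sketch that compares the two forms on a hand-written Clayton density. The function `clayton_pdf` and the helper names are assumptions made for this example and are not pycop's API; the vectorized form only works if the density function accepts arrays, as this one does.

```python
import timeit

import numpy as np


def clayton_pdf(u, v, theta):
    """Clayton copula density for theta > 0 (stand-in for copula.get_pdf)."""
    return (1.0 + theta) * (u * v) ** (-theta - 1.0) * (
        u ** -theta + v ** -theta - 1.0
    ) ** (-2.0 - 1.0 / theta)


def neg_loglik_loop(theta, u, v):
    # One density evaluation per observation, summed with Python's built-in sum.
    return -sum([np.log(clayton_pdf(u[i], v[i], theta)) for i in range(len(u))])


def neg_loglik_vectorized(theta, u, v):
    # A single density evaluation on the whole arrays, summed with numpy.sum.
    return -np.sum(np.log(clayton_pdf(u, v, theta)))


# Synthetic pseudo-observations in (0, 1), for illustration only.
rng = np.random.default_rng(0)
u = rng.uniform(0.01, 0.99, size=5_000)
v = rng.uniform(0.01, 0.99, size=5_000)

loop_value = neg_loglik_loop(2.0, u, v)
vec_value = neg_loglik_vectorized(2.0, u, v)

# Rough timing; the actual ratio depends on the machine and the sample size.
t_loop = timeit.timeit(lambda: neg_loglik_loop(2.0, u, v), number=3)
t_vec = timeit.timeit(lambda: neg_loglik_vectorized(2.0, u, v), number=3)
print(f"loop: {t_loop:.4f}s  vectorized: {t_vec:.4f}s")
```

Both functions return the same value up to floating-point rounding, so the optimizer sees the same objective; only the cost per evaluation changes.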

ioannisrpt commented 1 year ago

FYI, I have also supplied the Jacobian to the minimization problem when 'SLSQP' is used, in an effort to speed up the estimation further, but the difference is minimal. However, I can create a pull request with the Jacobian included if you want.
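For reference, this is roughly what supplying the Jacobian looks like with scipy.optimize.minimize. It is a sketch under assumptions, not the code from estimation.py: the Clayton log-likelihood, its analytic gradient, and all helper names are written out by hand for this example.

```python
import numpy as np
from scipy.optimize import minimize


def neg_loglik(x, u, v):
    """Negative Clayton log-likelihood, computed in log space (theta = x[0] > 0)."""
    theta = x[0]
    s = u ** -theta + v ** -theta - 1.0
    log_c = (
        np.log1p(theta)
        - (theta + 1.0) * (np.log(u) + np.log(v))
        - (2.0 + 1.0 / theta) * np.log(s)
    )
    return -np.sum(log_c)


def neg_loglik_grad(x, u, v):
    """Analytic derivative of neg_loglik with respect to theta."""
    theta = x[0]
    lu, lv = np.log(u), np.log(v)
    a, b = u ** -theta, v ** -theta
    s = a + b - 1.0
    ds = -(a * lu + b * lv)  # derivative of s with respect to theta
    d = (
        1.0 / (1.0 + theta)
        - (lu + lv)
        + np.log(s) / theta ** 2
        - (2.0 + 1.0 / theta) * ds / s
    )
    return np.array([-np.sum(d)])


# Simulate Clayton-dependent data with the conditional-inversion method.
rng = np.random.default_rng(1)
true_theta, n = 2.0, 2_000
u = rng.uniform(size=n)
w = rng.uniform(size=n)
v = (u ** -true_theta * (w ** (-true_theta / (1.0 + true_theta)) - 1.0) + 1.0) ** (
    -1.0 / true_theta
)

bounds = [(1e-3, 20.0)]
res_jac = minimize(neg_loglik, x0=[1.0], args=(u, v), jac=neg_loglik_grad,
                   method="SLSQP", bounds=bounds)
res_num = minimize(neg_loglik, x0=[1.0], args=(u, v),
                   method="SLSQP", bounds=bounds)
print(res_jac.x[0], res_jac.nfev, res_num.x[0], res_num.nfev)
```

With a single parameter, the finite-difference gradient that SLSQP falls back to costs only one or two extra function evaluations per step, which is consistent with the small gain reported here.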

maximenc commented 1 year ago

Hi @ioannisrpt, thank you for bringing this to my attention! I have applied the suggested change to the “logl” function and adjusted the rest of the code accordingly.

I am open to any suggestions regarding the optimization algorithm.

I really appreciate your efforts in contributing to pycop.

Many thanks 🙂

ioannisrpt commented 1 year ago

@maximenc I'm glad I could help! Including the Jacobian does not improve speed much, so I think it is best to leave it out; it would only bloat the code for no real benefit.