Open R-Palazzo opened 1 year ago
Marking this as a feature request, as we can track this as a performance improvement.
Part of the issue is that for the purposes of Gaussian Copula computation, it is not enough just to fit a KDE. We also have to convert the distribution using a CDF (and back). We should probably profile all steps of this.
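A rough way to profile those steps in isolation is sketched below. It assumes SciPy's `gaussian_kde` stands in for the KDE, uses `integrate_box_1d` as the CDF, and inverts the CDF by root finding with `brentq`. The helper names (`timed`, `kde_cdf`, `kde_ppf`) are hypothetical and this is not necessarily how the library implements the transform; it only illustrates why the CDF and inverse CDF may cost far more than the fit itself.

```python
import time

import numpy as np
from scipy.optimize import brentq
from scipy.stats import gaussian_kde, norm


def timed(label, func, timings):
    """Run func(), record its wall-clock duration under label, return its result."""
    start = time.perf_counter()
    result = func()
    timings[label] = time.perf_counter() - start
    return result


rng = np.random.default_rng(0)
data = rng.normal(size=500)  # synthetic stand-in for a single numeric column
timings = {}

# Step 1: fit the KDE (cheap: it mostly stores the data and a bandwidth).
kde = timed("fit", lambda: gaussian_kde(data), timings)


def kde_cdf(x):
    """CDF of the fitted KDE at x, via the 1D box integral from -inf to x."""
    return kde.integrate_box_1d(-np.inf, x)


# Step 2: map every value to [0, 1] with the CDF, then to normal space.
u = timed("cdf", lambda: np.array([kde_cdf(x) for x in data]), timings)
u = np.clip(u, 1e-6, 1 - 1e-6)  # keep norm.ppf finite
z = norm.ppf(u)

# Step 3: map back with the inverse CDF. The KDE has no closed-form inverse,
# so each value needs a root search that evaluates the CDF many times.
lo = data.min() - 10 * data.std()
hi = data.max() + 10 * data.std()


def kde_ppf(p):
    """Inverse CDF of the fitted KDE at probability p, found with brentq."""
    return brentq(lambda x: kde_cdf(x) - p, lo, hi)


back = timed("inverse_cdf", lambda: np.array([kde_ppf(p) for p in u[:50]]), timings)

for label, seconds in timings.items():
    print(f"{label}: {seconds:.4f} s")
```

Only 50 values are inverted here to keep the run short. If the inverse step dominates, per-value root finding is a likely place to look first.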
Environment details
Problem description
I'm looking to sample a 1D distribution using the `gaussian_kde` option of the `field_distributions` parameter of `GaussianCopula()`. `real_data` is a `pd.DataFrame()` with only one column, named 'Data'.

When I fit and sample with this setting, it works, but it takes far longer than `GaussianCopula()` with default parameters. I tried different numbers of samples for `real_data`, and it is 50 to 200 times slower with `gaussian_kde`. I also tried `gaussian_kde()` from SciPy, which is much faster to fit and sample from: it takes roughly the same time as, or a bit longer than, `GaussianCopula()` with default parameters.
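For reference, the SciPy side of that comparison can be timed with a sketch like the one below. The function name `time_scipy_kde`, the synthetic normal data, and the row counts are assumptions chosen for illustration. The copula side is not reproduced here because the original snippet is not part of this report.

```python
import time

import numpy as np
from scipy.stats import gaussian_kde


def time_scipy_kde(n_rows, n_samples, seed=0):
    """Time fitting and sampling a 1D SciPy KDE on synthetic data.

    Returns (fit_seconds, sample_seconds, shape_of_samples).
    """
    rng = np.random.default_rng(seed)
    values = rng.normal(size=n_rows)  # stand-in for the single 'Data' column

    start = time.perf_counter()
    kde = gaussian_kde(values)
    fit_seconds = time.perf_counter() - start

    start = time.perf_counter()
    samples = kde.resample(n_samples, seed=seed)  # array of shape (1, n_samples)
    sample_seconds = time.perf_counter() - start

    return fit_seconds, sample_seconds, samples.shape


for n_rows in (100, 1_000, 10_000):
    fit_s, sample_s, shape = time_scipy_kde(n_rows, n_samples=n_rows)
    print(f"rows={n_rows}: fit {fit_s:.4f} s, sample {sample_s:.4f} s, shape {shape}")
```

Sampling from the KDE directly only picks data points and adds Gaussian noise, so it involves no CDF evaluation at all, which may explain part of the gap.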