Closed Rhyst223 closed 3 years ago
Hi @Rhyst223, glad to hear that you found fastKDE!
Unfortunately, this is unavoidable: it is one of the two 'curses of dimensionality' associated with this method (see Section 5 of O'Brien et al. (2016) for a discussion). The memory requirement grows exponentially with the number of variables: a 16-variable KDE would need something on the order of 100^16 bytes (1e32 bytes) of memory, which is more than all the RAM chips in existence could provide.
O’Brien, T. A., K. Kashinath, N. R. Cavanaugh, W. D. Collins, and J. P. O’Brien, 2016: A fast and objective multidimensional kernel density estimation method: FastKDE. Comput. Stat. Data Anal., 101, 148–160, https://doi.org/10.1016/j.csda.2016.02.014.
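To put the estimate above in numbers, here is a rough back-of-the-envelope sketch. The function name `grid_memory_bytes`, the 100 grid points per variable, and the 1 byte per grid cell are illustrative assumptions chosen to match the order-of-magnitude figure quoted above; the actual grid size and per-cell storage depend on the data and the implementation.

```python
def grid_memory_bytes(num_vars, points_per_dim=100, bytes_per_cell=1):
    """Rough memory needed to hold a dense grid over num_vars variables.

    points_per_dim and bytes_per_cell are assumed values for illustration,
    not numbers taken from fastKDE itself.
    """
    return bytes_per_cell * points_per_dim ** num_vars


# Each extra variable multiplies the requirement by points_per_dim.
for num_vars in (1, 2, 3, 4, 8, 16):
    print(f"{num_vars:2d} variables: ~{grid_memory_bytes(num_vars):.0e} bytes")
```

Under these assumptions, each additional variable multiplies the requirement by 100, so the grid outgrows any realistic amount of RAM long before 16 variables.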
You may need to consider using parametric methods to encode relationships among the variables. Unfortunately, fastKDE won't work in this case.
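As one possible illustration of a parametric approach (not something fastKDE provides), the sketch below fits a single multivariate Gaussian with NumPy and resamples from it. The random `data` array is a stand-in for the real 16-column data frame, and `n_draws` is a hypothetical sample count. A single Gaussian captures only the means and linear covariances, so it is reasonable only if the data are roughly Gaussian.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the real (n_samples, 16) data; random values are an assumption.
data = rng.normal(size=(1000, 16))

# Fit the parametric model: a mean vector and a 16 x 16 covariance matrix.
mean = data.mean(axis=0)
cov = np.cov(data, rowvar=False)

# Draw new samples that share the fitted means and covariances.
n_draws = 500  # hypothetical N
samples = rng.multivariate_normal(mean, cov, size=n_draws)
print(samples.shape)
```

Storage here scales with the square of the number of variables (the covariance matrix) rather than exponentially, which is why this kind of model stays feasible at 16 dimensions.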
Best of luck!
Hi,
Thanks for this package, it is great! However, I've been having some issues when applying it to my own data. For anything more than 3 dimensions I keep getting RAM crashes (I am using Google Colab), and I was wondering whether your package is feasible for my use case.
My goal: I have a data frame with 16 dimensions, and I want to fit a KDE that encodes the covariance between dimensions and then resample from that KDE N times, so that I end up with samples of shape (N, 16). Do you think this is possible with your package, or is my number of dimensions just too large?
Reproduction with random data:
Running the above crashes my Colab session.
Thanks!