This PR introduces SparseKDE, a class located at src/skmatter/utils/_sparsekde.py. It mitigates the high cost of kernel density estimation (KDE) on large datasets by evaluating the KDE only at selected data points (e.g. grid points chosen by farthest point sampling). The class takes the original dataset as a parameter and fits the model on the sampled grid points.
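To illustrate the idea for reviewers, here is a minimal NumPy sketch of evaluating a KDE only at grid points chosen by farthest point sampling. It is not the code in this PR: the function names `farthest_point_sampling` and `gaussian_kde_on_grid`, the isotropic Gaussian kernel, and the fixed bandwidth are all assumptions made for the example.

```python
import numpy as np


def farthest_point_sampling(X, n_select, start=0):
    """Greedily pick n_select row indices of X that are far apart (hypothetical helper)."""
    selected = [start]
    # Distance from every sample to its nearest already-selected point.
    min_dist = np.linalg.norm(X - X[start], axis=1)
    for _ in range(n_select - 1):
        idx = int(np.argmax(min_dist))
        selected.append(idx)
        min_dist = np.minimum(min_dist, np.linalg.norm(X - X[idx], axis=1))
    return np.array(selected)


def gaussian_kde_on_grid(X, grid, bandwidth):
    """Evaluate an isotropic Gaussian KDE of X at the grid points only (assumed kernel)."""
    n_samples, n_features = X.shape
    # Squared distances between each grid point and each sample: (n_grid, n_samples).
    sq_dist = ((grid[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    norm = n_samples * (2.0 * np.pi * bandwidth**2) ** (n_features / 2.0)
    return np.exp(-0.5 * sq_dist / bandwidth**2).sum(axis=1) / norm


rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))          # full dataset
grid_idx = farthest_point_sampling(X, n_select=20)
density = gaussian_kde_on_grid(X, X[grid_idx], bandwidth=0.5)
```

The kernel sums still run over all 500 samples, but only 20 evaluation points are needed instead of 500, which is where the saving comes from in this sketch.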
Two auxiliary classes and several auxiliary functions used by SparseKDE are stored in src/skmatter/utils/_sparsekde.py.
Two distance metrics compatible with periodic boundary conditions (PBC), pairwise_euclidean_distances and pairwise_mahalanobis_distances, are implemented in src/skmatter/metrics/pairwise.py.
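For context, a PBC-compatible Euclidean distance can be sketched with the minimum-image convention, as below. This is an illustration only: the name `periodic_euclidean_distances`, the `cell_length` argument, and the assumption of an orthorhombic cell are mine, and the signatures in this PR may differ.

```python
import numpy as np


def periodic_euclidean_distances(X, Y, cell_length):
    """Pairwise Euclidean distances under the minimum-image convention.

    cell_length holds the period along each feature (orthorhombic cell assumed).
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    cell_length = np.asarray(cell_length, dtype=float)
    # Displacement between every pair of rows: shape (n_X, n_Y, n_features).
    diff = X[:, None, :] - Y[None, :, :]
    # Wrap each component into [-L/2, L/2] so the nearest periodic image is used.
    diff -= cell_length * np.round(diff / cell_length)
    return np.sqrt((diff**2).sum(axis=-1))


# Two points near opposite edges of a unit cell are close across the boundary.
d = periodic_euclidean_distances([[0.1, 0.5]], [[0.9, 0.5]], cell_length=[1.0, 1.0])
```

Without the wrapping step the two points would be 0.8 apart; with it, the distance is measured through the periodic boundary.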
Tests for SparseKDE and some auxiliary functions are stored in tests/test_neighbors.py. Tests for distance metrics are stored in tests/test_metrics.py.
I am not sure whether the current API of SparseKDE is acceptable, or whether the auxiliary classes should be integrated into SparseKDE. SparseKDE also seems too large and complex. It may need to be decomposed into smaller parts, but I have not figured out how.
Contributor (creator of PR) checklist
[x] Tests updated (for new features and bugfixes)?
[x] Documentation updated (for new features)?
[ ] Issue referenced (for PRs that solve an issue)?
For Reviewer
📚 Documentation preview 📚: https://scikit-matter--221.org.readthedocs.build/en/221/