Closed by PaulWAyers 2 years ago
Thanks for sharing! Looks very promising! @PaulWAyers @Ali-Tehrani
The paper on DPPy can be found in the tools subfolder. I am reading more about determinantal point processes and hope we can employ them. #4
A practical example of using DPPs for diverse subset sampling can be found in "Fast mixing Markov chains for strongly Rayleigh measures, DPPs, and constrained sampling". @PaulWAyers @Ali-Tehrani
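For anyone who wants to see what Markov-chain sampling of a fixed-size DPP looks like in practice, here is a minimal sketch of a Metropolis swap chain. It is a generic textbook-style chain and not necessarily the exact one analysed in that paper. The function names (`det`, `submatrix`, `swap_chain_kdpp`) and the toy kernel are hypothetical and chosen only for illustration.

```python
import random


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-15:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def submatrix(L, idx):
    """Principal submatrix of L restricted to the indices in idx."""
    return [[L[i][j] for j in idx] for i in idx]


def swap_chain_kdpp(L, k, steps, rng):
    """Metropolis swap chain targeting P(S) proportional to det(L_S), |S| = k.

    Assumes 0 < k < len(L) and that L is symmetric positive semidefinite.
    """
    n = len(L)
    current = rng.sample(range(n), k)
    current_det = det(submatrix(L, current))
    for _ in range(steps):
        # Symmetric proposal: replace one selected item by one unselected item.
        pos = rng.randrange(k)
        outside = [i for i in range(n) if i not in current]
        proposal = current[:]
        proposal[pos] = rng.choice(outside)
        proposal_det = det(submatrix(L, proposal))
        # Always leave a zero-probability start; otherwise use the Metropolis rule.
        if current_det <= 0 or rng.random() < min(1.0, proposal_det / current_det):
            current, current_det = proposal, proposal_det
    return sorted(current)


# Toy kernel (assumed for illustration): items 0 and 1 are near-duplicates.
L = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
sample = swap_chain_kdpp(L, k=2, steps=500, rng=random.Random(0))
```

Because det(L_S) is small when S contains similar items, the chain spends most of its time on subsets that mix the near-duplicate pair with the third item. Recomputing the determinant from scratch at every step keeps the sketch short; a real implementation would update it incrementally.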
The strategy I proposed, using kd-trees to select maximally diverse samples, is not new. See https://pubs-acs-org.libaccess.lib.mcmaster.ca/doi/abs/10.1021/ci980100c
J. Chem. Inf. Comput. Sci. 1999, 39, 1, 51–58
Subsumed by #7
@Ali-Tehrani suggested using determinantal point processes. I found existing code for this in Julia and Python:
- https://github.com/theogf/DeterminantalPointProcesses.jl
- https://github.com/guilgautier/DPPy
- https://github.com/mbp28/determinantal-point-processes
- https://github.com/sverdoot/regularized-dpp

The last of these, regularized-dpp, implements this paper: https://arxiv.org/pdf/1906.04133.pdf
Some other papers are: http://proceedings.mlr.press/v99/derezinski19a/derezinski19a.pdf https://openreview.net/pdf?id=BkzBwNrlLS
This paper, which @Ali-Tehrani found, suggests that diverse sampling improves not only the speed but also the accuracy and robustness of kernel methods. (I am going by the abstract, which is all I have read so far.) https://arxiv.org/abs/2002.08616
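To make the idea concrete, here is a minimal pure-Python sketch of greedy determinant-maximizing selection, a common deterministic stand-in for drawing a DPP sample. It is not taken from any of the repositories above. The function names `rbf_kernel` and `greedy_dpp_map`, the choice of an RBF kernel, and the `length_scale` parameter are all assumptions made for illustration.

```python
import math


def rbf_kernel(points, length_scale=1.0):
    """Similarity matrix L[i][j] = exp(-|x_i - x_j|^2 / (2 * length_scale^2))."""
    n = len(points)
    return [
        [
            math.exp(
                -sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                / (2.0 * length_scale ** 2)
            )
            for j in range(n)
        ]
        for i in range(n)
    ]


def greedy_dpp_map(L, k):
    """Greedily pick up to k indices, each time adding the item that most
    increases det(L_S). Uses an incremental Cholesky-style update so that no
    determinant is recomputed from scratch.
    """
    n = len(L)
    # gain[i] is det(L_{S + i}) / det(L_S) for the current selection S.
    gain = [L[i][i] for i in range(n)]
    # chol[i] holds the Cholesky row of item i relative to the selected items.
    chol = [[] for _ in range(n)]
    selected = []
    remaining = set(range(n))
    while len(selected) < k and remaining:
        j = max(sorted(remaining), key=lambda i: gain[i])
        if gain[j] <= 1e-12:
            break  # every remaining item is (numerically) redundant
        selected.append(j)
        remaining.remove(j)
        root = math.sqrt(gain[j])
        for i in remaining:
            e = (L[j][i] - sum(a * b for a, b in zip(chol[j], chol[i]))) / root
            chol[i].append(e)
            gain[i] -= e * e
    return selected


# Hypothetical data: three well-separated groups on a line.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 0.0), (5.1, 0.0), (10.0, 0.0)]
chosen = greedy_dpp_map(rbf_kernel(points), k=3)
```

With these points the selection should take one item from each group, since adding a near-duplicate of an already chosen point barely increases the determinant. Unlike a true DPP sample, the greedy result is deterministic, so it is better suited to picking a single diverse subset than to averaging over many draws.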