jmschrei / apricot

apricot implements submodular optimization for the purpose of selecting subsets of massive data sets to train machine learning models quickly. See the documentation page: https://apricot-select.readthedocs.io/en/latest/index.html
MIT License

[MRG] Reorganize code, add mixtures, add bidirectional #6

Closed jmschrei closed 4 years ago

jmschrei commented 4 years ago

This PR reorganizes the code to decouple the optimizer from the selection object. The selection object should encode the function, calculate the gains, and store the selected subset. This lets users write their own optimizers and apply them to any function, or write a function without needing to know how to write an efficient optimizer. This PR also adds mixtures of submodular functions and the bidirectional greedy algorithm for non-monotonic submodular functions.
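
To illustrate the kind of split described above, here is a minimal sketch in plain Python. It is not the code in this PR and does not use apricot's API: `CoverageFunction`, `gain`, `select`, and `naive_greedy` are hypothetical names chosen for the example, and a toy set-coverage objective stands in for a real submodular function. The function object only knows how to score and record items, and the optimizer only knows how to ask for gains.

```python
class CoverageFunction:
    """Hypothetical selection object: encodes a set-coverage function,
    calculates marginal gains, and stores the selected subset."""

    def __init__(self, sets):
        self.sets = sets          # list of Python sets, one per candidate item
        self.covered = set()      # elements covered by the current selection
        self.selected = []        # indices of the items chosen so far

    def gain(self, idx):
        # Marginal gain: number of new elements item `idx` would cover.
        return len(self.sets[idx] - self.covered)

    def select(self, idx):
        # Record the choice and update the internal state.
        self.covered |= self.sets[idx]
        self.selected.append(idx)


def naive_greedy(function, n_items, k):
    """Hypothetical optimizer: relies only on `gain` and `select`,
    so it works with any object exposing those two methods."""
    for _ in range(k):
        candidates = [i for i in range(n_items) if i not in function.selected]
        best = max(candidates, key=function.gain)
        function.select(best)
    return function.selected


sets = [{1, 2, 3}, {3, 4}, {4, 5, 6, 7}, {1, 7}]
selected = naive_greedy(CoverageFunction(sets), len(sets), 2)
print(selected)
```

Under this arrangement, swapping in a different optimizer (for example a lazy greedy or a bidirectional greedy) would not require touching `CoverageFunction`, and a new function only needs to provide the same two methods.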