aai-institute / pyDVL

pyDVL is a library of stable implementations of algorithms for data valuation and influence function computation
https://pydvl.org
GNU Lesser General Public License v3.0

Interruptible samplers #321

Open mdbenito opened 1 year ago

mdbenito commented 1 year ago

#319 introduces PermutationSampler, but it does not allow interrupting the sampling within a permutation, as required for TMCS (Truncated Monte Carlo Shapley).

One possibility would be to make samplers coroutines rather than simple Iterables, with __iter__ returning a Generator[NDArray, bool, None] that accepts a boolean via send() to interrupt the sampling of the current permutation. With this in place, permutation_montecarlo_shapley and semivalues with shapley_coefficient and PermutationSampler should be equivalent, provided caching is enabled (see the comment in semivalues.py; this is unrelated to the interruption itself).
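To make the proposal concrete, here is a minimal sketch of a generator-based sampler that can be told to abandon the current permutation. It is not pyDVL code: `interruptible_permutations` and `consume` are hypothetical names, and tuples stand in for NDArray so that the example only needs the standard library. The send type is `Optional[bool]` rather than `bool` because the initial `next()` call sends `None`.

```python
import random
from typing import Callable, Generator, List, Optional, Sequence, Tuple

Subset = Tuple[int, ...]


def interruptible_permutations(
    indices: Sequence[int], n_permutations: int, seed: Optional[int] = None
) -> Generator[Subset, Optional[bool], None]:
    """Hypothetical sampler: yields growing prefixes of random permutations.

    Sending True after a yield abandons the current permutation and moves
    on to the next one (the truncation step of TMCS).
    """
    rng = random.Random(seed)
    for _ in range(n_permutations):
        perm = list(indices)
        rng.shuffle(perm)
        for k in range(1, len(perm) + 1):
            interrupt = yield tuple(perm[:k])
            if interrupt:
                break  # skip the rest of this permutation


def consume(
    sampler: Generator[Subset, Optional[bool], None],
    should_truncate: Callable[[Subset], bool],
) -> List[Subset]:
    """Drive the sampler with send() so that no yielded sample is skipped."""
    seen: List[Subset] = []
    try:
        subset = next(sampler)  # prime the generator
        while True:
            seen.append(subset)
            # The truncation decision is sent back; send() returns the next sample.
            subset = sampler.send(should_truncate(subset))
    except StopIteration:
        return seen
```

Note that the consumer cannot use a plain `for` loop here: `send()` itself returns the next sample, so mixing it with iteration would silently drop values. A truncation criterion would normally depend on the utility; in this sketch it is any callable on the subset, for example `consume(interruptible_permutations(range(4), 2), lambda s: len(s) >= 2)`.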

Additionally, stratified samplers might require either simple interruption or information fed back from the utility computations. For instance, an adaptive variance-reducing sampler might need the state (running moments?) of each stratum separately.
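As a rough illustration of the second case, the sketch below sends utility values back to the sampler, which keeps running moments per stratum (Welford's algorithm) and uses them to choose the next stratum. Everything here is an assumption made for illustration: the names `RunningMoments`, `adaptive_stratified_sampler` and `run` are hypothetical, strata are taken to be subset sizes, and the greedy "pick the stratum with the largest variance estimate" rule is a simple stand-in rather than an established allocation scheme.

```python
import math
import random
from dataclasses import dataclass
from typing import Callable, Dict, Generator, Optional, Sequence, Tuple

Subset = Tuple[int, ...]


@dataclass
class RunningMoments:
    """Running mean and variance of one stratum (Welford's algorithm)."""

    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of squared deviations from the running mean

    def update(self, x: float) -> None:
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        # Infinite until two values are seen, so every stratum gets explored.
        return self.m2 / (self.count - 1) if self.count > 1 else math.inf


def adaptive_stratified_sampler(
    indices: Sequence[int], n_samples: int, seed: Optional[int] = None
) -> Generator[Tuple[int, Subset], Optional[float], Dict[int, RunningMoments]]:
    """Hypothetical sampler: strata are subset sizes, feedback is the utility."""
    rng = random.Random(seed)
    moments = {size: RunningMoments() for size in range(1, len(indices) + 1)}
    for _ in range(n_samples):
        # Greedy heuristic (an assumption): sample where variance is largest.
        size = max(moments, key=lambda s: moments[s].variance)
        value = yield size, tuple(rng.sample(list(indices), size))
        if value is not None:
            moments[size].update(value)
    return moments


def run(sampler, utility: Callable[[Subset], float]) -> Dict[int, RunningMoments]:
    """Feed utility values back to the sampler; return its final moments."""
    try:
        _, subset = next(sampler)
        while True:
            _, subset = sampler.send(utility(subset))
    except StopIteration as stop:
        return stop.value
```

The same `send()` channel as for interruption is reused, only with a float instead of a boolean; a real design would have to decide whether both kinds of feedback share one channel or whether the sampler gets a separate update method.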

schroedk commented 1 week ago

@janosg potentially resolved by #558