aai-institute / pyDVL

pyDVL is a library of stable implementations of algorithms for data valuation and influence function computation
https://pydvl.org
GNU Lesser General Public License v3.0

Implement Variance Reduced Data Shapley (VRDS) #275

Open AnesBenmerzoug opened 1 year ago

AnesBenmerzoug commented 1 year ago

Introduced in Wu, M., Jia, R., Huang, W., & Chang, X. (2022). Robust Data Valuation via Variance Reduced Data Shapley. arXiv preprint arXiv:2210.16835.

The idea is to use stratified sampling to reduce the variance of the estimated data Shapley values. The paper also analyzes how to stratify and how many samples to draw from each stratum.
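To make the idea concrete, here is a minimal sketch of a stratified Monte Carlo estimator for Shapley values. It assumes the strata are coalition sizes, and it takes the number of samples per stratum as an input rather than implementing the allocation rule from the paper. The function name `stratified_shapley` and its parameters are hypothetical, chosen for illustration, and are not part of pyDVL's API.

```python
import random
from typing import Callable, FrozenSet, List, Optional


def stratified_shapley(
    utility: Callable[[FrozenSet[int]], float],
    n: int,
    samples_per_stratum: List[int],
    seed: Optional[int] = None,
) -> List[float]:
    """Estimate Shapley values for n points, stratifying by coalition size.

    samples_per_stratum[k] is the number of coalitions of size k sampled
    for each point (hypothetical parameter; the allocation is left to the caller).
    """
    if len(samples_per_stratum) != n:
        raise ValueError("need one sample count per coalition size 0..n-1")
    if any(m < 1 for m in samples_per_stratum):
        raise ValueError("every stratum needs at least one sample")

    rng = random.Random(seed)
    values = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k, m_k in enumerate(samples_per_stratum):
            # Average marginal contribution of i over coalitions of size k.
            acc = 0.0
            for _ in range(m_k):
                subset = frozenset(rng.sample(others, k))
                acc += utility(subset | {i}) - utility(subset)
            total += acc / m_k
        # Each coalition size carries equal weight 1/n in the Shapley value.
        values.append(total / n)
    return values


# Usage with an additive toy utility: every marginal contribution of
# point i equals weights[i], so the estimate matches the weights.
weights = [1.0, 2.0, 3.5]
estimates = stratified_shapley(
    lambda s: sum(weights[j] for j in s), n=3, samples_per_stratum=[2, 2, 2], seed=0
)
```

Taking the sample counts as an argument keeps the estimator separate from the allocation strategy, so different allocations can be compared without touching the sampling loop.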

mdbenito commented 1 year ago

See #223

mdbenito commented 1 year ago

And, in case you want to test methods and start coding, maybe we can work together on https://github.com/appliedAI-Initiative/pyDVL/tree/feature/sampler