aai-institute / pyDVL

pyDVL is a library of stable implementations of algorithms for data valuation and influence function computation
https://pydvl.org
GNU Lesser General Public License v3.0

Some improvements to OOB notebook #431

Closed mdbenito closed 9 months ago

mdbenito commented 9 months ago

Description

This PR adds some text and supporting functions to the OOB notebook.

Changes

I also sneaked in a couple of unrelated things:

Checklist

mdbenito commented 9 months ago

@BastienZim I've worked a bit on your notebook; let me know if you have comments. I was a bit surprised by how different the results can be across seeds: removing the worst 20% of points often degraded performance. In the end I added complete randomization of the whole run, including the splitting of the dataset, to see what the true variance is. The variance is a lot higher, but things are predictable. I also added random seed handling to compute_oob and made a couple of minor changes here and there.
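For readers who want to see what randomizing the whole run looks like, here is a minimal sketch. It is not the notebook's code and does not use pyDVL: the toy dataset, the through-the-origin regression, and the names `make_dataset`, `split`, `run_once`, and `repeat_runs` are all hypothetical stand-ins. The only point is that a single seed drives every random step of a run, including the train/test split, and that the spread over many seeds is what gets reported.

```python
import random
import statistics


def make_dataset(n, seed=0):
    """Toy 1-D regression data, y = 2x + noise (hypothetical stand-in)."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, 2.0 * x + rng.gauss(0.0, 0.3)) for x in xs]


def split(data, test_fraction, rng):
    """Shuffle a copy of the data with the given RNG, then split it."""
    shuffled = list(data)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def run_once(data, seed):
    """One full run: every random step draws from an RNG built from `seed`."""
    rng = random.Random(seed)
    train, test = split(data, 0.3, rng)
    # Least-squares slope through the origin, standing in for the real model.
    slope = sum(x * y for x, y in train) / sum(x * x for x, _ in train)
    # Mean squared error on the held-out part.
    return sum((y - slope * x) ** 2 for x, y in test) / len(test)


def repeat_runs(data, seeds):
    """Repeat the run over several seeds and summarize the spread."""
    scores = [run_once(data, s) for s in seeds]
    return statistics.mean(scores), statistics.stdev(scores)


DATA = make_dataset(200)
mean_mse, std_mse = repeat_runs(DATA, range(20))
print(f"MSE over 20 seeds: {mean_mse:.4f} +/- {std_mse:.4f}")
```

Because the split is drawn inside `run_once`, the reported standard deviation includes the variability due to the partition itself, which a fixed split would hide. Reusing a seed reproduces a run exactly.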