Closed scoutsaachi closed 1 year ago
The simplest solution would be to require users to specify an experiment name (`exp_name`) in both `start_scoring_checkpoint` and `finalize_scores`. This way, target (gradient) arrays will be tied to an experiment name, and we don't need to rely on `w+` for cleaning up between scoring different targets.
Unfortunately, this introduces a (minor) backward incompatibility: running `start_scoring_checkpoint` without `exp_name` will error out.
To fully support dataset sharding, my guess is that we'll need to make minor changes to the train gradient stores as well. Can you provide a minimal test with the desired functionality (computing TRAK scores over different parts of the dataset in parallel)? Thanks!
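For illustration, here is a minimal sketch of how tying target arrays to an experiment name could work. It is not TRAK's actual code: `open_target_store`, the file naming scheme, and the `ValueError` are all hypothetical, and it assumes the target store is a NumPy `.npy` memmap.

```python
import os

import numpy as np
from numpy.lib.format import open_memmap


def open_target_store(save_dir, exp_name, shape, dtype=np.float32):
    """Hypothetical helper: one on-disk target array per experiment name."""
    if not exp_name:
        # Mirrors the backward incompatibility discussed above:
        # a missing experiment name is an error.
        raise ValueError("exp_name is required")
    # Assumed naming scheme; the real layout may differ.
    path = os.path.join(save_dir, f"{exp_name}_target_grads.npy")
    if os.path.exists(path):
        # Reopen without truncating; shape and dtype come from the file header.
        return open_memmap(path, mode="r+")
    # First use of this experiment name: create a zero-filled array.
    return open_memmap(path, mode="w+", dtype=dtype, shape=shape)
```

Because each experiment name maps to its own file, scoring a new target never needs to wipe an existing array. This sketch does not check that a reopened file matches the requested `shape`.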
Honestly, for now I think just swapping w+ for r+ is enough. It is weird/unexpected that the behavior is different between featurize (where we do not overwrite) and score (where we do).
More to the point, the inds argument is useless if you are going to overwrite on every call.
(I will create a test, but this seems like a pretty simple fix just for the basics)
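To make the overwrite concrete, here is a small standalone demonstration of the difference between the two modes. It assumes the target store is a NumPy memmap opened with `numpy.lib.format.open_memmap`; `write_rows` and `demo` are hypothetical names, not part of TRAK.

```python
import os
import tempfile

import numpy as np
from numpy.lib.format import open_memmap


def write_rows(path, inds, value, shape, mode):
    """Write `value` into rows `inds` of an on-disk array opened with `mode`."""
    if mode == "w+":
        # 'w+' creates the file, or truncates an existing one to zeros.
        store = open_memmap(path, mode="w+", dtype=np.float32, shape=shape)
    else:
        # 'r+' opens an existing file for reading and writing, keeping its data.
        store = open_memmap(path, mode="r+")
    store[inds] = value
    store.flush()
    del store


def demo(second_call_mode):
    """Score two index ranges in two calls and return the final array."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "targets.npy")
        shape = (4, 2)
        write_rows(path, [0, 1], 1.0, shape, mode="w+")  # first call creates the file
        write_rows(path, [2, 3], 2.0, shape, mode=second_call_mode)
        return np.load(path)
```

With `demo("w+")`, rows 0 and 1 come back as zeros because the second call truncated the file. With `demo("r+")`, both index ranges survive.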
This line https://github.com/MadryLab/trak/blob/main/trak/traker.py#L268 should not zero out what is already in the target store. Change from 'w+' to 'r+'.
We should also add a naming system so that you can score different targets.