kristian-georgiev closed this 1 year ago
https://github.com/MadryLab/trak/commit/16e9d4627c41292a4b81a0d28962dbc42803239c incorporates https://github.com/MadryLab/trak/pull/43/commits/ee43d8ba0e7c7da4da932a06e5783fec609325b8 (block-wise `get_scores` for large datasets).
https://github.com/MadryLab/trak/commit/62426eba866ff566cbf9ca9c28d12933ab9ffee6 incorporates https://github.com/MadryLab/trak/commit/efb67196a78dbd868801cf532c51504a68db2f6b (only write to disk once when scoring).
I left things inside of `BasicScoreComputer` and changed the signature of `get_scores` to use an accumulator to store the results, instead of making a new `FastScoreComputer`.
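For readers following along, here is a minimal sketch of the accumulator idea, using NumPy as a stand-in for the tensors involved. The function name `get_scores_blockwise`, its parameters, and the `block_size` default are hypothetical and are not the actual signature in the commits above; the point is only that the caller preallocates the output once and each block writes into its own slice.

```python
import numpy as np


def get_scores_blockwise(features, target_grads, accumulator, block_size=2):
    """Fill `accumulator` in place with features @ target_grads.T, block by block.

    Hypothetical helper for illustration; not the signature used in trak.
    """
    n_train = features.shape[0]
    for start in range(0, n_train, block_size):
        stop = min(start + block_size, n_train)
        # Only one row block of the features is multiplied at a time,
        # and the result lands directly in the preallocated output.
        accumulator[start:stop] = features[start:stop] @ target_grads.T
    return accumulator


rng = np.random.default_rng(0)
features = rng.standard_normal((5, 3))      # 5 train examples, 3 projected dims
target_grads = rng.standard_normal((4, 3))  # 4 target examples, 3 projected dims

scores = np.zeros((5, 4))                   # allocated once by the caller
get_scores_blockwise(features, target_grads, scores)
```

Because the output buffer is owned by the caller, it can be written to disk once after the loop finishes rather than once per block.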
https://github.com/MadryLab/trak/commit/259f087071a9dcf248e65727d3bb269ed563baea incorporates the rest of the enhancements.
@AlaaKhaddaj How does this plan sound? Let's:
- move `get_matrix_mult_blockwise` and all its helper functions to `utils.py`
- keep `BasicScoreComputer` as it is
- use `get_matrix_mult_blockwise` in a new class `FastScoreComputer`
- make sure the interfaces of `BasicScoreComputer` and `FastScoreComputer` are the same so we don't add `if` statements in the `finalize_features` method that calls them
- check that `BasicScoreComputer` and `FastScoreComputer` are functionally equivalent

Once we have all this, we can further optimize how we save and load the TRAK features and target gradients to reduce I/O latency.
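A rough sketch of the shape this plan could take, again with NumPy standing in for the real tensors. The class and helper names are borrowed from the plan, but every body, argument list, and the `score_with` caller here is an assumption made for illustration, not the code in the repository.

```python
import numpy as np


def get_matrix_mult_blockwise(a, b, block_size):
    """Compute a @ b.T one row block of `a` at a time (illustrative stand-in)."""
    out = np.empty((a.shape[0], b.shape[0]), dtype=np.result_type(a, b))
    for start in range(0, a.shape[0], block_size):
        stop = start + block_size  # slicing past the end is safe in NumPy
        out[start:stop] = a[start:stop] @ b.T
    return out


class BasicScoreComputer:
    def get_scores(self, features, target_grads):
        # Single dense product.
        return features @ target_grads.T


class FastScoreComputer:
    def __init__(self, block_size=2):
        self.block_size = block_size

    def get_scores(self, features, target_grads):
        # Same method name and arguments as BasicScoreComputer.
        return get_matrix_mult_blockwise(features, target_grads, self.block_size)


def score_with(score_computer, features, target_grads):
    # Hypothetical caller: with a shared interface it needs no `if` on the type.
    return score_computer.get_scores(features, target_grads)


rng = np.random.default_rng(0)
features = rng.standard_normal((7, 4))
target_grads = rng.standard_normal((3, 4))

basic = score_with(BasicScoreComputer(), features, target_grads)
fast = score_with(FastScoreComputer(block_size=3), features, target_grads)
```

An equivalence test along these lines would compare `basic` and `fast` with a tolerance (for example `np.allclose`), since block-wise and dense products can differ by floating-point rounding.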