expectopatronum opened this issue 4 years ago
I believe it does loop over the entire training set; the train_loader is a PyTorch DataLoader that carries all that info with it. We can see here: https://github.com/nimarb/pytorch_influence_functions/blob/4df5d2ec1baae38d70345740b7eca7466e3b48ef/pytorch_influence_functions/calc_influence_function.py#L133 that it iterates over the dataset that is sent along with the dataloader (which would be the entire training set).
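For illustration, here is a minimal sketch of such a per-sample loop, assuming a standard classification loss; it is not the library's exact code, just the pattern the linked line follows (one gradient per training sample, pulled from `train_loader.dataset`):

```python
import torch
import torch.nn.functional as F

def grad_z_sketch(z, t, model):
    """Gradient of the loss at a single training point (z, t) w.r.t. the model parameters."""
    model.eval()
    loss = F.cross_entropy(model(z), t)  # stand-in loss; the library uses its own calc_loss
    return torch.autograd.grad(loss, list(model.parameters()))

def calc_grad_z_sketch(model, train_loader, start=0):
    """Compute one grad_z per training sample, starting at index `start`."""
    grad_zs = []
    for i in range(start, len(train_loader.dataset)):
        z, t = train_loader.dataset[i]
        # batch the single sample so the model sees the usual input shape
        z = train_loader.collate_fn([z])
        t = train_loader.collate_fn([t])
        grad_zs.append(grad_z_sketch(z, t, model))
    return grad_zs
```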
Hi, @andrewsilva9 is correct: in line 133 it loops over the entire training dataset, so one grad_z is calculated per training sample.
With the `start` argument you can begin at a different point in the training dataset. This is useful if you split the calculation across multiple machines: you calculate samples [0-100] on machine 1, and on machine 2 you pass `start=101` to calculate training samples [101-x]. The end `x` is still missing in the implementation...
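A hypothetical variant of the sketch above with that missing `end` parameter could look like the following; `calc_grad_z_split` and `end` are illustrative assumptions, not the library's actual API, and `grad_z_sketch` refers to the earlier sketch:

```python
def calc_grad_z_split(model, train_loader, start=0, end=None):
    """Compute grad_z only for samples in [start, end)."""
    dataset = train_loader.dataset
    end = len(dataset) if end is None else min(end, len(dataset))
    grad_zs = []
    for i in range(start, end):
        z, t = dataset[i]
        z = train_loader.collate_fn([z])
        t = train_loader.collate_fn([t])
        grad_zs.append(grad_z_sketch(z, t, model))
    return grad_zs

# machine 1: samples [0, 100]
# calc_grad_z_split(model, train_loader, start=0, end=101)
# machine 2: samples 101 to the end of the dataset
# calc_grad_z_split(model, train_loader, start=101)
```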
Hi, I am not sure if I am misunderstanding the parameter or if it shouldn't be passed to `calc_grad_z`: https://github.com/nimarb/pytorch_influence_functions/blob/4df5d2ec1baae38d70345740b7eca7466e3b48ef/pytorch_influence_functions/calc_influence_function.py#L111 I assumed it should loop over the whole training set. Thanks and best regards, Verena