kohpangwei / influence-release


Maybe a bug: retrain without re-initialization? #16

Closed: tengerye closed this issue 5 years ago

tengerye commented 5 years ago

Hi, when I read the code, I noticed that the parameters are only initialized at the creation of a model. In other words, retrain() actually continues training without re-initialization. Is that alright?
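
For readers skimming the thread, here is a minimal sketch of the two behaviors being contrasted. The names below are hypothetical, not this repository's actual API:

```python
# Hypothetical sketch (not this repository's actual API) of the two
# behaviors the question contrasts.
import numpy as np

class Model:
    def __init__(self, dim, seed=0):
        # Parameters are initialized only here, at creation time.
        self.theta = np.random.default_rng(seed).normal(size=dim)

    def train(self, X, y, lr=0.1, steps=500):
        # Gradient descent on squared loss, starting from whatever
        # self.theta currently holds.
        for _ in range(steps):
            self.theta -= lr * X.T @ (X @ self.theta - y) / len(y)

# What the issue describes: retrain() keeps the current parameters, so a
# second call continues training from them (a warm start).
def retrain_warm(model, X, y):
    model.train(X, y)

# What the issue author expected: re-initialize first, then train from
# scratch (a cold start).
def retrain_cold(model, X, y):
    model.theta = np.random.default_rng().normal(size=model.theta.shape)
    model.train(X, y)
```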

expectopatronum commented 5 years ago

Nice catch! This is not clear from the paper; the only mention that they retrain from the already trained parameters is in the last sentence of the caption for Figure 2.

[screenshot of the Figure 2 caption]

tengerye commented 5 years ago

@expectopatronum @kohpangwei But I don't think that is right. To calculate the true difference, my understanding is that we should remove that training example and train from scratch.

expectopatronum commented 5 years ago

You might be right, and that is also what I thought it did. I am just reporting what I found in the paper.

kohpangwei commented 5 years ago

For convex models, it does not matter (in theory) whether we retrain from a warm start or cold start. For non-convex models, it does matter, and in general we can expect that retraining from a cold start would give quite different results (e.g., due to different random initializations). This variance will generally swamp the effect of removing a single training point. To get around this issue, we retrain from the initial learned parameters \tilde{\theta}, as @expectopatronum pointed out.

When removing groups of examples, we might expect retraining from a cold start to give similar results, but AFAIK this has not been tested systematically. See https://arxiv.org/abs/1810.03611 and https://arxiv.org/abs/1905.13289 for more details if you're interested!
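
To illustrate the retraining scheme described above, here is a small self-contained sketch (function and variable names are assumptions, not the repository's code): train on the full data to get \tilde{\theta}, then drop one example and retrain from \tilde{\theta} (warm start) rather than from a fresh random initialization (cold start).

```python
# Sketch of the leave-one-out comparison described above, using a simple
# convex model (logistic regression trained by gradient descent).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logreg(X, y, theta_init, lr=0.1, steps=2000):
    """Gradient descent on mean logistic loss, starting from theta_init."""
    theta = theta_init.copy()
    for _ in range(steps):
        grad = X.T @ (sigmoid(X @ theta) - y) / len(y)
        theta -= lr * grad
    return theta

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X @ rng.normal(size=5) + 0.1 * rng.normal(size=200) > 0).astype(float)

# Step 1: train on the full data to get \tilde{\theta}.
theta_tilde = train_logreg(X, y, np.zeros(5))

# Step 2: remove one training point and retrain from \tilde{\theta} (warm start).
i = 17
mask = np.arange(len(y)) != i
theta_warm = train_logreg(X[mask], y[mask], theta_tilde)

# Cold start for comparison: a fresh random initialization. For this convex
# problem both runs should land close together; for non-convex models the
# cold-start run can end up far away, swamping the effect of removing point i.
theta_cold = train_logreg(X[mask], y[mask], rng.normal(size=5))

print("warm-start change:", np.linalg.norm(theta_warm - theta_tilde))
print("warm vs cold gap :", np.linalg.norm(theta_warm - theta_cold))
```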

tengerye commented 5 years ago

@kohpangwei Got it. Thank you so much. @expectopatronum Thank you for your kind reply, too.