liyunsheng13 / BDL

MIT License

Which model selected for next iterations? #10

Closed river-afk closed 5 years ago

river-afk commented 5 years ago

Hello @liyunsheng13 , thank you very much for the code. I have questions on Algorithm 1 in your paper:

How do you select which M^k_i model (trained with Eqn 3) will be used for the next iteration? I imagine that you validate all snapshots of M^k_i (saved during training) and pick the best one? A similar question applies to F^k (trained with Eqn 2) and M^k_0 (trained with Eqn 1).

Thanks

liyunsheng13 commented 5 years ago

That's a good question. It is indeed hard to decide which model should be used for the next iteration, but I don't validate all the snapshots and pick the best one. I train the model for 120,000 iterations and simply use the last snapshot.
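For anyone else following this thread, here is a minimal sketch of that convention, not the repository's actual training code: function and file names (`train_one_ssl_round`, `snapshot_*.pth`), the snapshot interval, and the `ignore_index=255` convention for unlabeled pseudo-label pixels are all assumptions for illustration. The point it shows is that no intermediate snapshot is validated; only the one saved at the final iteration is carried into the next SSL round.

```python
import torch
import torch.nn as nn

TOTAL_ITERS = 120_000   # SSL training budget mentioned above
SAVE_EVERY = 10_000     # hypothetical snapshot interval

def train_one_ssl_round(model, loader, optimizer, device="cuda"):
    """Train for TOTAL_ITERS iterations, saving periodic snapshots.
    Only the final snapshot is passed to the next SSL round; no
    model selection is done over the intermediate snapshots."""
    criterion = nn.CrossEntropyLoss(ignore_index=255)  # assumed: 255 marks ignored pixels
    it = 0
    while it < TOTAL_ITERS:
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
            it += 1
            if it % SAVE_EVERY == 0:
                torch.save(model.state_dict(), f"snapshot_{it}.pth")
            if it >= TOTAL_ITERS:
                break
    # The convention described above: take the last snapshot, nothing else.
    return f"snapshot_{TOTAL_ITERS}.pth"
```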

river-afk commented 5 years ago

Thanks for the quick reply. Very interesting.

That means the last model always performs better than the initial one from the previous iteration. Does that hold true in all of your experiments? Do you always observe stable results using the last snapshot, i.e., if you train the framework multiple times, do you achieve consistent improvement after each SSL iteration?

liyunsheng13 commented 5 years ago

Actually, the last snapshot (at 120,000 iterations for SSL) is not always the best, but it is almost always close to the best. I think the result is stable when I stop at 120,000 iterations for SSL. I did run the same experiments several times and achieved similar performance; the results shown in the paper are not the best run but an "average" one picked from a single experiment. However, when the model is trained without SSL, the overfitting problem is much more severe, so stopping at 80,000 iterations is very important in that case.
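To make the two stopping points concrete (the iteration counts come from this thread; the function name is purely illustrative, not part of the repository), the choice could be written as:

```python
def training_budget(use_ssl: bool) -> int:
    """Iteration budget reported in this thread: 120,000 when training
    with SSL (pseudo-labels), 80,000 without SSL, where overfitting
    sets in earlier."""
    return 120_000 if use_ssl else 80_000
```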