Closed mariembenslama closed 4 years ago
Hi. Long time no reply, haha. Thanks, and long time no see!
My question is: maybe we can take the average of the two models' weights?
Hi, different people have different classes, so we need to modify the last LSTM layer. We don't need to store the weights separately; we just load them like we normally do. The only difference is that we take the average of the two models' weights.
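A minimal sketch of what "averaging the two models' weights" could look like. In a real PyTorch project the values would be tensors from `model.state_dict()` loaded out of the .pth files; here plain floats stand in so the idea is self-contained, and all names (`average_state_dicts`, `ckpt_a`, `ckpt_b`) are hypothetical.

```python
# Hypothetical sketch: average two checkpoints' parameters key by key.
# Real .pth checkpoints hold torch tensors; plain floats stand in here.

def average_state_dicts(sd_a, sd_b):
    """Return a new state dict whose values are the element-wise mean."""
    # Averaging only makes sense if both models share the same architecture,
    # i.e. the same parameter names (and, with tensors, the same shapes).
    assert sd_a.keys() == sd_b.keys(), "models must share the same parameters"
    return {k: (sd_a[k] + sd_b[k]) / 2 for k in sd_a}

# Toy "checkpoints": parameter name -> weight value.
ckpt_a = {"lstm.weight": 0.4, "fc.bias": -1.0}
ckpt_b = {"lstm.weight": 0.8, "fc.bias": 1.0}

merged = average_state_dicts(ckpt_a, ckpt_b)
print(merged)  # lstm.weight averages to ~0.6, fc.bias to 0.0
```

Note the assertion: if the two models were trained on different class sets (so the last layer differs), the keys or shapes won't match and a plain average isn't possible for that layer.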
I see. And then, where do we store the weights after that? In the .pth file, right? Also, will averaging the weights hurt performance? What do you think?
I'm not sure why you want to merge the weights of two models; I think it will hurt performance. If you want to continue training from the merged model, you don't need to save it first, though you can. In fact, a .pth file is just the structure and the weights of the model, so you are right: we can save the merged model as a .pth file.
I thought I could train on different Google Colab accounts and then merge the checkpoints to speed up the learning.
But thanks for the explanation again and again ^_^ !
It's a nice idea, but distributed training isn't as simple as merging the weights of two models. The performance isn't 1+1>2; it's more likely to be 1+1<2. If the reason you want to train on two accounts is the session time limit, you can train on part of the data first, then use the rest of the data together with the first-stage pretrained model to train the final model.
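A toy illustration of that two-stage suggestion, assuming the only constraint is a session time limit: train on part of the data, checkpoint, then resume from that checkpoint on the rest. A scalar model fitted by gradient descent stands in for the real network; everything here (`train`, `stage1_data`, `stage2_data`) is a hypothetical sketch, not this project's code.

```python
# Hypothetical two-stage training sketch: a scalar parameter theta is fit
# by gradient descent on mean squared error, in two sessions.

def train(theta, data, lr=0.1, steps=200):
    """One gradient-descent run minimizing mean squared error to the data."""
    for _ in range(steps):
        grad = sum(2 * (theta - x) for x in data) / len(data)
        theta -= lr * grad
    return theta

stage1_data = [1.0, 2.0, 3.0]          # first session's share of the data
stage2_data = [4.0, 5.0, 6.0]          # remaining data, second session

theta = train(0.0, stage1_data)        # stage 1: pretrain on part of the data
# ...in practice the checkpoint is saved here and reloaded in the next session...
theta = train(theta, stage2_data)      # stage 2: continue from the checkpoint
print(theta)
```

Unlike weight averaging, this keeps one model and one optimization trajectory: the second stage simply starts from the first stage's parameters instead of from scratch.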
I see, thanks. But isn't training on only part of the data like drawing an incomplete curve over only some of the features?
More data is used to prevent overfitting.
Hello, I wanted to ask if we could merge two .pth files trained on different datasets for this project?
Thanks.