sinason opened this issue 4 years ago
You could split off a small subset of the training data as a validation set, train the model, and pick the epoch that achieves the best validation accuracy (or the lowest loss, or where the loss stops decreasing noticeably). Then retrain on the whole training set (including the former validation set) and test with the checkpoint from that epoch. I don't understand the second question; do you mean resuming the model during training?
The second question is: if training is interrupted for some reason, how can I restart it from the most recent checkpoint? As for the loss, my GPU limits me to a small number of epochs, and the loss stops improving once it reaches the 2.x range. Since the test set has no labels, I cannot measure accuracy on it, so could you share the approximate final loss range of the model in your paper?
I see. I will add a resume option to train.py. One suggestion: uncomment (i.e. use) nn.BatchNorm2d(head_conv) in ctrbox_net.py to help stabilize training. As for the loss, I checked my logs: with batch sizes of 20 and 48, the loss ranges from 1 to 4, and the larger batch size yields a smaller loss.
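A resume option along these lines could be sketched as follows. This is a minimal PyTorch sketch, not the actual code from train.py; the function names and checkpoint keys are my own assumptions:

```python
import torch


def save_checkpoint(model, optimizer, epoch, path):
    # Persist everything needed to resume: model weights,
    # optimizer state (e.g. momentum buffers), and the epoch index.
    torch.save({
        "epoch": epoch,
        "model_state": model.state_dict(),
        "optimizer_state": optimizer.state_dict(),
    }, path)


def load_checkpoint(model, optimizer, path):
    # Restore weights and optimizer state in place;
    # return the epoch at which training should resume.
    ckpt = torch.load(path, map_location="cpu")
    model.load_state_dict(ckpt["model_state"])
    optimizer.load_state_dict(ckpt["optimizer_state"])
    return ckpt["epoch"] + 1
```

The training loop would then call save_checkpoint at the end of each epoch, and on restart call load_checkpoint (e.g. behind a hypothetical --resume flag) to get the starting epoch instead of beginning at 0. Saving the optimizer state matters here: restoring only the weights would reset momentum and any learning-rate schedule.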
Thanks a lot!