openvinotoolkit / training_extensions

Train, Evaluate, Optimize, Deploy Computer Vision Models via OpenVINO™
https://openvinotoolkit.github.io/training_extensions/
Apache License 2.0

Finetuning a trained model #601

Closed WingRS closed 2 years ago

WingRS commented 3 years ago

Hi! Using the training scripts, I was able to train the model and reach around 40% mAP on my dataset. Here is the TensorBoard log: Selection_096

I have around 50k images, roughly 5k per class (9 classes in total). The main question: when I start training from the best saved model, the training curve just oscillates seemingly at random and then degrades. Is there a special setting for fine-tuning in your setup? And here is the TensorBoard log when training starts not from snapshot.pth but from my last best model: Selection_097
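A common cause of this oscillation is restarting with the original (high) learning rate instead of the one the schedule had decayed to. A minimal sketch of fine-tuning overrides for an mmdetection-style config, assuming SGD and a step LR policy (the exact field names and values depend on the template you trained with, and `best_model.pth` is a hypothetical path):

```python
# Hypothetical fine-tuning overrides for an mmdetection-style config.
# Lower the learning rate (here ~10x below the initial training value)
# and load only the weights, not the optimizer/scheduler state.
optimizer = dict(type='SGD', lr=0.001, momentum=0.9, weight_decay=0.0001)
lr_config = dict(policy='step', warmup='linear', warmup_iters=500, step=[8, 11])
load_from = 'work_dirs/best_model.pth'  # weights only
resume_from = None                      # do not restore optimizer state
```

Using `resume_from` instead would restore the optimizer and iteration counter as well, which is what you want when continuing an interrupted run rather than fine-tuning.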

manisoftwartist commented 2 years ago

@morkovka1337 Thanks for all your answers. For the first question, I meant to ask how to infer the learning rate from the .pth file, and you got it right. Very helpful for understanding my initial results and proceeding with my experiments.
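For anyone else landing here: if the checkpoint was saved with the optimizer state included (mmdetection-style checkpoints usually are, under an `optimizer` key), the last learning rate can be read directly from the .pth file. A small sketch under that assumption; `lr_from_checkpoint` is just an illustrative helper name:

```python
import torch

def lr_from_checkpoint(path):
    """Read the learning rate(s) stored in a training checkpoint.

    Assumes the .pth file contains the optimizer state under the
    'optimizer' key, as mmdetection-style checkpoints do; the exact
    layout may differ for other training setups.
    """
    ckpt = torch.load(path, map_location="cpu")
    # Each optimizer param group records the LR it was last updated with.
    return [group["lr"] for group in ckpt["optimizer"]["param_groups"]]
```

If the checkpoint holds only model weights (no `optimizer` key), the learning rate is not recoverable from the file and has to come from the config or the training logs.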

manisoftwartist commented 2 years ago

I've already trained without the initial weights, and it works like a charm. The initial setup was the one provided in the repo. My message was more about a possible problem with the initial weights, since they don't match the architecture, as PyTorch reports.

@WingRS What exactly do you mean by "works like a charm"? Does it converge faster? Have you compared the initial mAP on your validation set between loading the pretrained weights (using --load_weights) and starting from scratch (using neither --load_weights nor --resume_from)? By initial mAP, I mean the values as soon as the network is initialized, before training starts.

@morkovka1337 If we start training from scratch (using neither --load_weights nor --resume_from), how are the weights initialized? Randomly?

morkovka1337 commented 2 years ago

@morkovka1337 If we start training from scratch (using neither --load_weights nor --resume_from), how are the weights initialized? Randomly?

If there are no values for load_from or resume_from in the config file (arguments like --load-weights simply overwrite them), then yes, I think so. If you want to use pretrained weights for the backbone, you can write this in the config:

init_cfg = dict(type='Pretrained',
            checkpoint='torchvision://resnet50')

See this tutorial for more details.
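To make the placement concrete: in mmdet 2.x-style configs, `init_cfg` goes inside the backbone definition, while `load_from`/`resume_from` are top-level keys. A minimal sketch (field names follow mmdet 2.x conventions; the surrounding model fields are illustrative, not a complete config):

```python
# Minimal mmdetection-style fragment: only the backbone is initialized
# from torchvision-pretrained ResNet-50 weights; everything else is
# randomly initialized because no full-model checkpoint is loaded.
model = dict(
    backbone=dict(
        type='ResNet',
        depth=50,
        init_cfg=dict(type='Pretrained',
                      checkpoint='torchvision://resnet50')))
load_from = None    # no full-model checkpoint
resume_from = None  # fresh optimizer state
```

This is the key difference: `init_cfg` seeds only the backbone, whereas `load_from` restores the entire detector's weights.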