TobyPDE / FRRN

Full Resolution Residual Networks for Semantic Image Segmentation
MIT License

How to fine tune the model #9

Closed LemonAniLabs closed 7 years ago

LemonAniLabs commented 7 years ago

Hi @TobyPDE

I have used train.py to train a new model, but how should I fine-tune the pre-trained model?

Thanks!!

TobyPDE commented 7 years ago

Hi,

We performed fine-tuning as follows:

- Switch from ADAM to plain SGD with a very small learning rate (e.g. < 1e-4).
- Perform minority oversampling, i.e. select images that contain rare classes more often. (Maybe I can add the corresponding data provider to the repository.)
- Slightly decrease the strength of the data augmentation.
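The minority-oversampling step above could be sketched roughly like this. This is not the repository's data provider; the function name, the `boost` factor, and the "bottom quartile" rarity rule are all illustrative assumptions:

```python
import random
from collections import defaultdict

def oversample_rare_classes(image_labels, boost=4):
    """Build a sampling pool where images containing rare classes appear
    more often. `image_labels` maps an image id to the set of class ids
    it contains; `boost` is a hypothetical oversampling factor."""
    # Count how many images each class appears in.
    counts = defaultdict(int)
    for classes in image_labels.values():
        for c in classes:
            counts[c] += 1
    # Treat classes in the bottom quartile of frequency as "rare".
    threshold = sorted(counts.values())[len(counts) // 4]
    rare = {c for c, n in counts.items() if n <= threshold}
    # Duplicate images that contain at least one rare class.
    pool = []
    for img, classes in image_labels.items():
        copies = boost if classes & rare else 1
        pool.extend([img] * copies)
    random.shuffle(pool)
    return pool
```

A training loop would then draw images from `pool` instead of sampling the dataset uniformly.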

That being said, the provided model has already been extensively fine-tuned, and it was selected using early stopping based on the validation error. Hence, I don't think it's possible to improve the score any further without making some adjustments to the architecture and/or the augmentation pipeline.

LemonAniLabs commented 7 years ago

Hi @TobyPDE, thanks for your reply. I changed ADAM to SGD and reset the learning rate, but it has been running for a very long time... about 4 days, and it still hasn't finished.

What GPU do you use, and how many days did training take in your experience?

So, can I just switch from ADAM to SGD and reset the learning rate to improve the model? I don't want to retrain the model from the Cityscapes training data you used; I just want to use new training data to improve your model (frrn_b.npz).

Thanks~~

TobyPDE commented 7 years ago

Hi @LemonAniLabs, we trained the model from scratch for about 7 days and fine-tuned it for a further 2-3 days on a Titan X (Pascal) GPU.

How do you determine that it hasn't finished? Can you maybe send me an email (tobias.pohlen@rwth-aachen.de) with a more detailed description of what you are trying to achieve? Maybe I can give some better assistance then.

Best, Toby

LemonAniLabs commented 7 years ago

Hi @TobyPDE, nice device! Mine is a GTX 1080 8GB, so I had to reduce the sample_factor to 4. The model has been fine-tuning for 250000 iterations... it seems endless. Will the training finish automatically? Also, I found a lot of .npz files in logs/ that are only 200 bytes each. How can I find the model snapshot?

Thanks~ LemonAniLabs

TobyPDE commented 7 years ago

Hi, no, the training does not terminate automatically. You have to perform early stopping by watching the validation IoU score in the training monitor. Also, the logging solution has been overhauled, and now all logs are written to a single file. If `x.npz` is your specified model file, then `x_snapshot_n.npz` is the model snapshot from iteration `n`.
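Given that naming scheme, one quick way to locate the most recent snapshot could be a small helper like this (a sketch; the function name and directory layout are assumptions, only the `_snapshot_n.npz` suffix comes from the thread):

```python
import glob
import os
import re

def latest_snapshot(model_file):
    """Return the snapshot with the highest iteration number for a given
    model file, e.g. model.npz -> model_snapshot_60000.npz."""
    stem, _ = os.path.splitext(model_file)
    pattern = re.compile(
        re.escape(os.path.basename(stem)) + r"_snapshot_(\d+)\.npz$")
    candidates = []
    for path in glob.glob(stem + "_snapshot_*.npz"):
        m = pattern.search(os.path.basename(path))
        if m:
            # Keep (iteration, path) pairs so max() picks the latest.
            candidates.append((int(m.group(1)), path))
    return max(candidates)[1] if candidates else None
```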

250000 seems really high. The model usually converges within 40000 - 60000 iterations.

LemonAniLabs commented 7 years ago

Oh, I see.

But there is a small problem in dltools/hooks.py.

I can get the snapshot by replacing

`if len(set(self.tags).intersection(kwargs.keys())) > 0 and kwargs["update_counter"] % self.frequency == 0:`

with

`if True and kwargs["update_counter"] % self.frequency == 0:`

Where should the `before_get_data` tag be set? Thanks

TobyPDE commented 7 years ago

This was a bug that happened when I refactored some code. I fixed it. Sorry for the inconvenience.
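For reference, the fixed gating logic presumably looks something like the sketch below. This is not the actual dltools/hooks.py code; the class name, the tag names, and the "empty tag set means always fire" rule are illustrative assumptions:

```python
class SnapshotHook:
    """Minimal sketch of a periodic snapshot hook. It fires when the
    current phase tag matches one of its configured tags (or when no
    tags are configured) and the update counter hits the frequency."""

    def __init__(self, frequency, tags=("after_update",)):
        self.frequency = frequency
        self.tags = set(tags)

    def should_fire(self, **kwargs):
        # Fire only on matching phase tags; an empty tag set means "always".
        tag_ok = not self.tags or bool(self.tags.intersection(kwargs))
        return tag_ok and kwargs.get("update_counter", 0) % self.frequency == 0
```

The bug reported above amounts to the tag check never matching, so replacing it with `True` forced the hook to fire on every phase.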

LemonAniLabs commented 7 years ago

Cool!! Thanks for your help!

I think we can close this issue.