Doubts/ possible features

tryolabs / luminoth

Deep Learning toolkit for Computer Vision.

BSD 3-Clause "New" or "Revised" License

I have some doubts regarding Luminoth as well as the Faster R-CNN algorithm.

The first is whether training on a local computer can be done in the same way as on the cloud, with the possibility of stopping and resuming from checkpoints.

The second is a question about Faster R-CNN. The default files specify 1000 epochs. How many epochs would be necessary for decent accuracy? In my case, the dataset has 70k training examples, which might make it infeasible to run that many epochs. I was wondering whether something like 10 epochs would be enough when the number of training examples is high.

The last question is whether training can be started from the SSD or Faster R-CNN checkpoints, i.e. whether transfer learning can be done by using them as the starting point.

Hello @AshwinAce!

1. Yes. This is the default behavior: whenever you resume training, it picks up from the latest checkpoint available in the job's folder. Try it out! ;)
2. You will rarely need to reach 1000 epochs. That value is set only so that training does not go on forever. Normally, you can keep training until your loss stabilizes (i.e. stops going down). Ideally, you would use the evaluation script and continue training until the mAP on your validation set stops improving. See here and here [this is part of a workshop we gave at PyImageConf and will be included in the official documentation soon].
3. At this point, it is not possible to fine-tune from an SSD or Faster R-CNN checkpoint. This is planned for the future. The only part of the model that is reused is the base network.
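
To make the "train until validation mAP stops improving" rule concrete, here is a minimal sketch of a patience-based stopping check. It is not part of Luminoth: the function name `stopping_epoch`, the `patience` and `min_delta` parameters, and the mAP values are all made up for illustration. It assumes you run the evaluation yourself once per epoch and record the resulting mAP.

```python
from typing import Iterable, Optional


def stopping_epoch(val_scores: Iterable[float], patience: int = 3,
                   min_delta: float = 0.0) -> Optional[int]:
    """Return the 1-based epoch at which to stop, or None if never triggered.

    Hypothetical helper, not a Luminoth API. Training stops once the
    validation score (e.g. mAP, higher is better) has failed to improve
    by more than `min_delta` for `patience` consecutive evaluations.
    """
    best = float("-inf")
    stale = 0  # evaluations since the last improvement
    for epoch, score in enumerate(val_scores, start=1):
        if score > best + min_delta:
            best = score
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return None


# Made-up mAP values, one per epoch, for illustration only.
history = [0.31, 0.42, 0.48, 0.50, 0.50, 0.49, 0.50, 0.51]
print(stopping_epoch(history, patience=3))
```

With a rule like this, the 1000-epoch setting only acts as an upper bound. A larger `patience` tolerates noisier validation curves at the cost of some extra training time.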

Let us know if you have more questions! I am closing this issue for now, but you may follow up :clinking_glasses:

Doubts/ possible features #233