Loss can't be below 1.

BlueAnthony commented 6 years ago

The loss does not decrease below 1. It stops and jitters around some value, such as 5 to 6 or 12 to 13, by around iteration 50. I have already tried different base learning rates: 0.001, 0.0001, 0.00001, and 0.000001. The loss starts at about 1000. I have 2 classes (car and pedestrian), 3712 images for training, and 3769 images for validation. I use yolov3.weights as the pretrained model.

Thank you!!

I use the code from pjreddie/darknet and am trying to fine-tune from yolov3.weights. The command I use is: "./darknet detector train cfg/kitti.data cfg/yolov3-kitti.cfg model/yolov3.weights". Yes, I use random=1. My cfg is modified from the yolov3.cfg in pjreddie/darknet.

Why do I use these learning rates and steps? Because yolov3.weights seems to remember the iteration count: max_batches for fine-tuning must be larger than 500200 before fine-tuning will start at all. The loss starts at about 1000 and stops decreasing at about iteration 500200+50. Am I misunderstanding something?

@AlexeyAB Thank you very much for your patience.

AlexeyAB commented 6 years ago

Use these params: https://github.com/AlexeyAB/darknet/blob/e29fcb703f8d936e17507bf78043a8b8bc6279b0/cfg/yolov3.cfg#L18-L23

And train for about 2000 iterations. If that doesn't help, then something is wrong with the number of classes or with your dataset. Check it using this software: https://github.com/AlexeyAB/Yolo_mark
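A quick scripted check can complement the visual inspection suggested above. The following is only a minimal sketch: it assumes the usual Darknet label layout of one `class x_center y_center width height` line per object, with coordinates normalized to [0, 1], and the function names `check_label_line` and `yolo_filters` are hypothetical helpers, not part of Darknet or Yolo_mark. The `filters` rule shown is the commonly documented one for YOLOv3, (classes + 5) * 3, for the convolutional layer before each [yolo] layer.

```python
def check_label_line(line, num_classes):
    """Return a list of problems found in one YOLO-format label line.

    Assumed format (not verified against this thread's dataset):
    "<class_id> <x_center> <y_center> <width> <height>", all box values in [0, 1].
    """
    parts = line.split()
    if len(parts) != 5:
        return [f"expected 5 fields, got {len(parts)}"]
    try:
        class_id = int(parts[0])
        box = [float(v) for v in parts[1:]]
    except ValueError:
        return ["non-numeric field"]

    problems = []
    if not 0 <= class_id < num_classes:
        problems.append(f"class id {class_id} outside 0..{num_classes - 1}")
    if any(not 0.0 <= v <= 1.0 for v in box):
        problems.append("coordinate outside [0, 1]")
    if box[2] <= 0 or box[3] <= 0:
        problems.append("non-positive width or height")
    return problems


def yolo_filters(num_classes, anchors_per_scale=3):
    """Expected `filters=` value before each [yolo] layer, assuming 3 anchors per scale."""
    return (num_classes + 5) * anchors_per_scale


if __name__ == "__main__":
    # Two classes (car, pedestrian), as in the question.
    print(check_label_line("0 0.5 0.5 0.2 0.3", num_classes=2))
    print(check_label_line("2 0.5 0.5 0.2 0.3", num_classes=2))
    print(yolo_filters(2))
```

Running `check_label_line` over every line of every label file and comparing `yolo_filters(2)` against the cfg would catch the two most common mismatches, though it says nothing about whether the boxes are drawn in the right place.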

Do you use random=1 in cfg-file?
Do you use the latest version of this GitHub repository? https://github.com/AlexeyAB/darknet

BlueAnthony commented 6 years ago

@AlexeyAB When I switch to "AlexeyAB/darknet", the command "./darknet detector train cfg/kitti.data cfg/yolov3-kitti.cfg model/yolov3.weights" directly saves the original model without training.

ghost commented 6 years ago

Your learning rate is about 1e-8, which is too small. If you'd like to use yolov3.weights as pre-trained weights, try the -clear option, which restarts the iteration count from 0.
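To see where a figure like 1e-8 can come from, here is a rough sketch of how a policy=steps schedule with burn-in behaves in Darknet. The values steps=400000,450000, scales=.1,.1 and burn_in=1000 are assumptions taken from the stock yolov3.cfg; the poster's modified cfg is not shown in this thread, and `darknet_steps_lr` is a hypothetical helper written for illustration, not a Darknet function.

```python
def darknet_steps_lr(iteration, base_lr, burn_in=1000,
                     steps=(400000, 450000), scales=(0.1, 0.1), power=4.0):
    """Approximate the learning rate at a given iteration under policy=steps.

    Defaults mirror the stock yolov3.cfg and are assumptions, not values
    confirmed for the cfg discussed in this thread.
    """
    # During burn-in the rate ramps up polynomially from 0 to base_lr.
    if iteration < burn_in:
        return base_lr * (iteration / burn_in) ** power
    # After burn-in, each step that has been passed multiplies the rate by its scale.
    lr = base_lr
    for step, scale in zip(steps, scales):
        if iteration < step:
            break
        lr *= scale
    return lr


if __name__ == "__main__":
    # Resuming at iteration ~500250 with the smallest base rate that was tried:
    print(darknet_steps_lr(500250, base_lr=0.000001))
    # Restarting from 0 with base_lr=0.001, once burn-in has finished:
    print(darknet_steps_lr(2000, base_lr=0.001))
```

Under these assumptions, resuming past both steps leaves the rate at base_lr * 0.1 * 0.1, so a base rate of 0.000001 ends up near 1e-8, while a run restarted from iteration 0 reaches the full base rate after burn-in.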

BlueAnthony commented 6 years ago

@panda9095 Thank you for your response. Could you give me more detail about "-clear"? When does "-clear" restart?

ghost commented 6 years ago

@BlueAnthony ./darknet detector train cfg/kitti.data cfg/yolov3-kitti.cfg model/yolov3.weights -clear

Doing so makes the iteration count start from 0 instead of 500200. Then you can use @AlexeyAB's parameters for training.
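The reason the count resumes at 500200 is that a Darknet weights file carries a counter of images seen in its header, and the starting iteration is derived from it. The sketch below reads that header under stated assumptions: a little-endian file, three 32-bit integers (major, minor, revision) followed by a 64-bit `seen` counter when major*10+minor >= 2 and a 32-bit one otherwise, and batch=64 in the cfg. The helper names are hypothetical, and the demo uses a fabricated in-memory header rather than the real yolov3.weights.

```python
import io
import struct


def read_darknet_header(f):
    """Read (major, minor, revision, seen) from an open binary weights file.

    Layout is an assumption based on Darknet's weight loader; little-endian
    and a 64-bit counter for newer files are assumed as well.
    """
    major, minor, revision = struct.unpack("<3i", f.read(12))
    if major * 10 + minor >= 2:
        (seen,) = struct.unpack("<Q", f.read(8))  # images seen, 64-bit
    else:
        (seen,) = struct.unpack("<i", f.read(4))  # images seen, 32-bit
    return major, minor, revision, seen


def start_iteration(seen, batch=64):
    """Iteration that training resumes from, assuming `batch` images per iteration."""
    return seen // batch


if __name__ == "__main__":
    # Fabricated header for illustration only: version 0.2.0, 500200 * 64 images seen.
    fake = io.BytesIO(struct.pack("<3iQ", 0, 2, 0, 500200 * 64))
    major, minor, revision, seen = read_darknet_header(fake)
    print(start_iteration(seen))
```

With -clear, the counter is reset to 0 when the weights are loaded, so the learning-rate schedule and max_batches are measured from the beginning again.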

BlueAnthony commented 6 years ago

@panda9095 Thank you very much for your help! I will try it.

AlexeyAB commented 6 years ago

@BlueAnthony Proper commands for training:

./darknet detector train cfg/kitti.data cfg/yolov3-kitti.cfg model/yolov3.weights -clear
./darknet detector train cfg/kitti.data cfg/yolov3-kitti.cfg darknet53.conv.74
./darknet detector train cfg/kitti.data cfg/yolov3-kitti.cfg yolov3.conv.105

You can get the pre-trained file yolov3.conv.105 by using this command:

./darknet partial cfg/yolov3.cfg yolov3.weights yolov3.conv.105 105

https://github.com/AlexeyAB/darknet/blob/fb9fcfb3ba0057b1196d1b95fd53cdb0b874b1c3/build/darknet/x64/partial.cmd#L21

AbhimanyuAryan commented 5 years ago

@AlexeyAB Does this mean that I can further train my last trained model (trained on my dataset) with new data?

AlexeyAB / darknet

Loss can't be below 1. #829