KAIIUK opened this issue 6 years ago
@KAIIUK Hi,
What mAP do you get on the training dataset, and on any validation dataset?
> it does not work at all on random video (i-leeds for example)
What is "i-leeds"?
The main reason it may not work properly on another dataset/video is that this rule is not respected: https://github.com/AlexeyAB/darknet#how-to-improve-object-detection
General rule - your training dataset should include a set of relative object sizes matching those of the objects you want to detect:
train_network_width * train_obj_width / train_image_width ~= detection_network_width * detection_obj_width / detection_image_width
train_network_height * train_obj_height / train_image_height ~= detection_network_height * detection_obj_height / detection_image_height
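As a sanity check, both sides of the rule can be computed directly. The numbers below are made up purely for illustration (network width 416, hypothetical image and object widths):

```shell
# Hypothetical example of the relative-size rule (all numbers invented).
# Training: network width 416, training images 1280 px wide, objects ~320 px wide.
train_rel=$((416 * 320 / 1280))    # relative object size seen during training
# Detection: same network width 416, video frames 640 px wide, objects ~160 px wide.
detect_rel=$((416 * 160 / 640))    # relative object size seen at detection time
echo "train=$train_rel detect=$detect_rel"
```

Here both sides come out to 104, so objects of this relative size in the video should be detectable; if the two values diverge strongly, detection quality drops.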
3 general reasons why a neural network may not detect objects:
- different scales and aspect ratios - to solve it, use: random=1 jitter=0.3 (or jitter=0.45 - but then you should train for 10 times more iterations)
- different colours - to solve it, use: exposure=1.5, saturation=1.5, hue=0.1 (in general: exposure>1, saturation>1, hue>0)
- different rotations - this kind of data augmentation isn't implemented yet for Yolo
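For orientation, this is a sketch of where those settings live in a darknet `.cfg` file (values taken from the advice above; all other lines omitted): the colour-augmentation parameters go in the `[net]` section, while jitter and multi-scale training are set per `[yolo]` layer.

```ini
# Sketch only - not a complete cfg. Colour augmentation in [net]:
[net]
saturation=1.5
exposure=1.5
hue=0.1

# ... all network layers omitted ...

# Scale/aspect-ratio augmentation in each [yolo] layer:
[yolo]
jitter=0.3
random=1
```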
That is, differences between the training dataset and the test dataset/video.
@AlexeyAB I just realized that I asked a rather stupid question, because the model is likely overfitting: there are only about 15 examples of the object, with about 20k images for each. I will also follow your advice and see what happens. And another question - what will happen if I use the yolov3 weights as the base network? Thank you.
> I have a model trained for 1 class on 400000 images ... there are only about 15 examples of the object and about 20k images for each
So you have about 15 objects with ~20k images for each. But what are the "15 examples"? You shouldn't use identical images.
It is better to use yolov3.conv.81 than yolov3.weights, because if you train with yolov3.weights -clear, then all weights starting from this layer will be shifted: https://github.com/AlexeyAB/darknet/blob/2bac3681fcecb225c5e5a6376d7f95835cd8f89e/cfg/yolov3.cfg#L599-L604 - since you have a different number of classes than in yolov3.cfg (80).
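The linked lines are the 1x1 convolutional layer feeding a [yolo] head; its filter count depends on the class count via filters = (classes + 5) * 3, which is why those weights cannot be reused as-is. A sketch following yolov3.cfg's convention:

```ini
# Sketch of the layer pair linked above (other parameters omitted).
[convolutional]
size=1
stride=1
pad=1
filters=255       # (classes + 5) * 3 = (80 + 5) * 3 = 255 for COCO's 80 classes
activation=linear

[yolo]
classes=80
```

For a 1-class model, filters would become (1 + 5) * 3 = 18, so this layer's shape no longer matches the pretrained yolov3.weights; cutting the weights at layer 81 (yolov3.conv.81) avoids loading the mismatched layers.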
To get the yolov3.conv.81 file, just run:
./darknet partial cfg/yolov3.cfg yolov3.weights yolov3.conv.81 81
then train:
./darknet detector train data/obj.data yolo-obj.cfg yolov3.conv.81
or
./darknet detector train data/obj.data yolo-obj.cfg darknet53.conv.74
@AlexeyAB Of course, I did not mean plain yolov3.weights - I just wanted to know whether the conv weights from v3 are better than darknet53.conv.74. Many thanks!
@KAIIUK
yolov3.conv.81 is better if your objects look like MS COCO objects; darknet53.conv.74 is better for the remaining cases.

@AlexeyAB, could you please explain how I can figure out which dataset (MS COCO vs Pascal VOC) my images/objects are more closely aligned with? Is it determined by the aspect ratio of the images, or the aspect ratio of the objects? Thanks!
Hi Aleksey. I admire your work, but now I have an extremely important question, or rather a request for advice. I have a model trained for 1 class on 400,000 images (both negative and positive ones included), but basically they are all very similar. It works excellently on test videos that have roughly the same objects and background, but it does not work at all on random videos (i-leeds, for example). Please advise how to improve the results, and whether it makes sense to mix in COCO or VOC, for example.