Training on BDD100K Dataset

RollingIsland commented 4 years ago

Hi @AlexeyAB, I'm training yolov3 on bdd100k. I have trained for 27200 iterations so far, but the loss won't go any lower: it has stayed around 5 for almost 10000 iterations, and the mAP is also growing slowly (chart: chart_yolov3-bdd100k). Detection also works poorly. I saw that you have offered a Gaussian yolov3 cfg file with a BDD suffix:

Is that better than yolov3 on this dataset?
Have you trained on bdd100k dataset before?
If you did, would you please share the weights file or give some advice?

VolkovAK commented 4 years ago

Hello! I'm training yolov3 right now on part of this dataset (only cars). I'm pretty sure that yolov3 can handle this dataset with adequate quality.

1) Did you change yolov3.cfg in some way? If so, can you please share the changes?

2) How did you parse the bdd100k labels? Yolo uses labels center_x, center_y, width, height; and bdd100k has x1, y1, x2, y2.

AlexeyAB commented 4 years ago

Train by using yolov3-spp.cfg

Or try to train by using:

cfg-file: https://drive.google.com/open?id=15WhN7W8UZo7-4a0iLkx11Z7_sDVHU4l1
pre-trained weights: https://drive.google.com/open?id=1ULnPnamS5A6lOgidlBXD24IdxoDAFaaV

RollingIsland commented 4 years ago

Thanks for your reply! @VolkovAK

Did you change yolov3.cfg in some way? If so, can you please share the changes?

Yes, I made all the changes mentioned in How to train (to detect your custom objects). I also changed the anchor box sizes using the command mentioned in How to improve object detection.

How did you parse the bdd100k labels? Yolo uses labels center_x, center_y, width, height; and bdd100k has x1, y1, x2, y2.

I'm sure I parse the labels correctly. Here is the code for the part you mentioned:

box_x_min = int(member[4][0].text) 
box_y_min = int(member[4][1].text)  
box_x_max = int(member[4][2].text)  
box_y_max = int(member[4][3].text) 
x_center = (box_x_min + box_x_max) / (2 * picture_width)
y_center = (box_y_min + box_y_max) / (2 * picture_height)
width = (box_x_max - box_x_min) / picture_width
height = (box_y_max - box_y_min) / picture_height
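The same arithmetic can be wrapped in a small function and checked on a known box. This is only a sketch: the function names and the example box are made up for illustration, and it assumes the usual darknet label line of a class id followed by the four normalized values.

```python
def corners_to_yolo(x_min, y_min, x_max, y_max, picture_width, picture_height):
    """Convert corner coordinates in pixels to a normalized YOLO box."""
    x_center = (x_min + x_max) / (2 * picture_width)
    y_center = (y_min + y_max) / (2 * picture_height)
    width = (x_max - x_min) / picture_width
    height = (y_max - y_min) / picture_height
    return x_center, y_center, width, height


def yolo_label_line(class_id, box):
    """Format one label line: class id, then the four normalized values."""
    return f"{class_id} " + " ".join(f"{v:.6f}" for v in box)


# Hypothetical box centered in a 1280x720 frame.
box = corners_to_yolo(320, 180, 960, 540, 1280, 720)
print(yolo_label_line(0, box))  # 0 0.500000 0.500000 0.500000 0.500000
```

A box centered in the frame and covering half of each dimension should give 0.5 for all four values, which is a quick sanity check for any parser.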

@AlexeyAB I ran into a problem when I tried to use the files you offered:

./darknet detector train bdd100k/bdd100k.data bdd100k/cd53paspp-gamma.cfg bdd100k/cd53paspp-gamma_final.weights -map

[screenshot of the error message]

It seems the weights you offered already contain the number of trained iterations, so should I just increase max_batches in the cfg file to continue?
What can I do if I want to see the loss chart? (It is strange to me as I only changed the cfg file and weights file.)

VolkovAK commented 4 years ago

@RollingIsland Yes, parsing code is OK.

About your questions:

1) The error in the screenshot says that something is wrong with Qt and the X server. It looks like you are running darknet in docker, and maybe you forgot to make the X server available to apps inside the container. Anyway, in my opinion you don't really need to watch the loss chart live, since the "-map" flag saves this chart in the darknet directory. So to get past this error, add the "-dont_show" flag.

2) Yes, you can increase max_batches, but I don't think that would be enough. The given cfg file (and I suppose the weights too) has 80 classes, while bdd100k has far fewer. This means that you can't use it for bdd100k "as is". First you should extract the part of the weights that doesn't change when you set the correct number of filters in the layer before [yolo]. For this I usually cut at the first yolo layer minus 1, so here it will be 113 (or 110 to be safe, it doesn't really matter):

./darknet partial bdd100k/cd53paspp-gamma.cfg bdd100k/cd53paspp-gamma_final.weights bdd100k/cd53paspp-gamma.113 113

The .113 at the end of the output file name is not an extension, just a reminder of how many layers it contains. Then train with these "weights" as the starting point. In this case you don't even need to increase the number of iterations.
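The rule of thumb above (first yolo layer minus 1) can be computed from a cfg file instead of counted by hand. A minimal sketch, assuming darknet numbers layers from 0 and does not count the [net] section as a layer; the function name and the toy cfg are made up for illustration and are not part of darknet.

```python
def first_yolo_layer_index(cfg_text):
    """Return the 0-based index of the first [yolo] section, not counting [net]."""
    index = -1
    for raw in cfg_text.splitlines():
        line = raw.strip()
        if not (line.startswith("[") and line.endswith("]")):
            continue  # only section headers mark a new layer
        name = line[1:-1].strip().lower()
        if name in ("net", "network"):
            continue  # global settings, not a layer
        index += 1
        if name == "yolo":
            return index
    raise ValueError("no [yolo] section found")


# Tiny made-up cfg, only to exercise the function.
toy_cfg = """
[net]
batch=64

[convolutional]
filters=32

[convolutional]
filters=45

[yolo]
classes=10
"""

yolo_index = first_yolo_layer_index(toy_cfg)
cutoff = yolo_index - 1  # the "first yolo layer - 1" rule of thumb
print(yolo_index, cutoff)  # 2 1
```

On a real cfg, read the file contents and pass them in; the resulting cutoff is the number you would give to the partial command.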

Oh, one more thing. Of course, it's up to you, but in my experience custom anchors for a particular dataset bring nothing good. I tried this too and got some strange predictions, like the traffic sign in your image (two rectangular anchors make a "cross"). Maybe we should also change the mask field in [yolo] to match the anchor sizes, like "all anchors <30 for the 1st layer, >30 and <60 for the second" and so on; I'm not sure about this. By the way, the default anchors are almost always good.
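The mask idea can be sketched as grouping anchor indices by size. This is an illustration of the guess above, not a confirmed recipe: the anchors are hypothetical, "size" is assumed to mean the larger side of the anchor, and the 30/60 thresholds are taken from the comment.

```python
def group_anchor_masks(anchors, thresholds=(30, 60)):
    """Split anchor indices into size groups by the larger side of each anchor."""
    groups = [[] for _ in range(len(thresholds) + 1)]
    for i, (w, h) in enumerate(anchors):
        size = max(w, h)
        # Count how many thresholds this anchor reaches; that is its group.
        group = sum(size >= t for t in thresholds)
        groups[group].append(i)
    return groups


# Hypothetical recalculated anchors (width, height), not real BDD100K values.
custom_anchors = [(8, 10), (14, 22), (25, 18), (34, 48), (52, 40),
                  (45, 58), (80, 70), (120, 150), (250, 200)]

masks = group_anchor_masks(custom_anchors)
for mask in masks:
    print("mask = " + ",".join(str(i) for i in mask))
```

If a group ends up with a number of anchors other than 3, the filters value of the convolutional layer before that [yolo] layer would presumably have to use that count instead of 3.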

Hope it will help!

RollingIsland commented 4 years ago

@VolkovAK Thanks for your detailed reply!

First you should extract the part of the weights that doesn't change when you set the correct number of filters in the layer before [yolo]. For this I usually cut at the first yolo layer minus 1, so here it will be 113 (or 110 to be safe, it doesn't really matter).

I don't quite understand. I changed the number of filters in the layer before [yolo] as described in How to train (to detect your custom objects), which gives 45 for bdd100k according to the formula (classes + 5)*3, and then I changed classes in the yolo layer to 10. Is that not enough?

but in my experience custom anchors for a particular dataset bring nothing good

Do you mean it won't make the result better, or that it makes the result even worse? Do I need to change back?

VolkovAK commented 4 years ago

@RollingIsland

I don't quite understand. I changed the number of filters in the layer before [yolo] as described in How to train (to detect your custom objects), which gives 45 for bdd100k according to the formula (classes + 5)*3, and then I changed classes in the yolo layer to 10. Is that not enough?

For the cfg it's enough, but the weights file was trained with 80 classes (255 filters). I'm not sure darknet can handle the situation where the number of filters in the cfg and in the weights doesn't match (I haven't checked, so I may be wrong!). And you can always train from scratch, it just takes longer.
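Both filter counts in this exchange come from the same formula, (classes + 5)*3. A short check, with a made-up function name and assuming 3 anchors per [yolo] layer as in the formula:

```python
def yolo_filters(classes, anchors_per_layer=3):
    """Filters in the convolutional layer before [yolo]: (classes + 5) * anchors."""
    return (classes + 5) * anchors_per_layer


print(yolo_filters(10))  # 45, the 10 bdd100k classes
print(yolo_filters(80))  # 255, the 80 classes of the provided weights
```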

Do you mean it won't make the result better, or that it makes the result even worse? Do I need to change back?

I think it's a kind of last resort for increasing quality. I don't insist on changing the anchors back, but I suggest you try recalculating them later, once the results are acceptable. I'm not sure whether it will make things better or worse, but it's certainly not necessary.

RollingIsland commented 4 years ago

@VolkovAK OK, I understand. I will try your advice. Thanks a lot!

AlexeyAB commented 4 years ago

./darknet detector train bdd100k/bdd100k.data bdd100k/cd53paspp-gamma.cfg bdd100k/cd53paspp-gamma_final.weights -map

It seems the weights you offered already contain the number of trained iterations, so should I just increase max_batches in the cfg file to continue?

Run with the -clear flag:

./darknet detector train bdd100k/bdd100k.data bdd100k/cd53paspp-gamma.cfg bdd100k/cd53paspp-gamma_final.weights -map -clear

Also change cfg-file as described here: https://github.com/AlexeyAB/darknet#how-to-train-to-detect-your-custom-objects

KKDDD commented 2 years ago

@VolkovAK Hello, I saw that you tested on the BDD (only cars) dataset. Could you share the accuracy you reached in training at that time? I am currently training yolo-v4-tiny on the BDD (only cars) dataset, but the accuracy is very low.

AlexeyAB / darknet

Training on BDD100K Dataset #5127