WongKinYiu / CrossStagePartialNetworks

Cross Stage Partial Networks
https://github.com/WongKinYiu/CrossStagePartialNetworks

how to edit cfg file. #14

Closed quocnhat closed 4 years ago

quocnhat commented 4 years ago

hello, I am trying to build a new small & efficient model by concatenating (backbone) csmobilenetv2 + (head) yolov3_tiny to detect objects. Everything goes well until ~1000 iterations (avg loss ~ 8.0), but the loss becomes NaN after that. Does anyone have any ideas? Thank you. Here is my layer information:

batch = 1, time_steps = 1, train = 0
layer filters size/strd(dil) input output
0 conv 32 3 x 3/ 2 224 x 224 x 3 -> 112 x 112 x 32 0.022 BF
1 conv 16/ 16 3 x 3/ 1 112 x 112 x 32 -> 112 x 112 x 16 0.007 BF
2 route 0 -> 112 x 112 x 32
3 conv 32 1 x 1/ 1 112 x 112 x 32 -> 112 x 112 x 32 0.026 BF
4 conv 32/ 32 3 x 3/ 1 112 x 112 x 32 -> 112 x 112 x 32 0.007 BF
5 conv 16 1 x 1/ 1 112 x 112 x 32 -> 112 x 112 x 16 0.013 BF
6 route 5 1 -> 112 x 112 x 32
7 conv 48 1 x 1/ 1 112 x 112 x 32 -> 112 x 112 x 48 0.039 BF
8 conv 48/ 48 3 x 3/ 2 112 x 112 x 48 -> 56 x 56 x 48 0.003 BF
9 route 7 -> 112 x 112 x 48
10 conv 96/ 48 3 x 3/ 2 112 x 112 x 48 -> 56 x 56 x 96 0.005 BF
11 conv 24 1 x 1/ 1 56 x 56 x 96 -> 56 x 56 x 24 0.014 BF
12 conv 72 1 x 1/ 1 56 x 56 x 24 -> 56 x 56 x 72 0.011 BF
13 conv 144/ 72 3 x 3/ 1 56 x 56 x 72 -> 56 x 56 x 144 0.008 BF
14 conv 24 1 x 1/ 1 56 x 56 x 144 -> 56 x 56 x 24 0.022 BF
15 Shortcut Layer: 11, wt = 0, wn = 0, outputs: 56 x 56 x 24 0.000 BF
16 route 15 8 -> 56 x 56 x 72
17 conv 72 1 x 1/ 1 56 x 56 x 72 -> 56 x 56 x 72 0.033 BF
18 conv 72/ 72 3 x 3/ 2 56 x 56 x 72 -> 28 x 28 x 72 0.001 BF
19 route 17 -> 56 x 56 x 72
20 conv 144/ 72 3 x 3/ 2 56 x 56 x 72 -> 28 x 28 x 144 0.002 BF
21 conv 32 1 x 1/ 1 28 x 28 x 144 -> 28 x 28 x 32 0.007 BF
22 conv 96 1 x 1/ 1 28 x 28 x 32 -> 28 x 28 x 96 0.005 BF
23 conv 192/ 96 3 x 3/ 1 28 x 28 x 96 -> 28 x 28 x 192 0.003 BF
24 conv 32 1 x 1/ 1 28 x 28 x 192 -> 28 x 28 x 32 0.010 BF
25 Shortcut Layer: 21, wt = 0, wn = 0, outputs: 28 x 28 x 32 0.000 BF
26 conv 96 1 x 1/ 1 28 x 28 x 32 -> 28 x 28 x 96 0.005 BF
27 conv 192/ 96 3 x 3/ 1 28 x 28 x 96 -> 28 x 28 x 192 0.003 BF
28 conv 32 1 x 1/ 1 28 x 28 x 192 -> 28 x 28 x 32 0.010 BF
29 Shortcut Layer: 25, wt = 0, wn = 0, outputs: 28 x 28 x 32 0.000 BF
30 conv 96 1 x 1/ 1 28 x 28 x 32 -> 28 x 28 x 96 0.005 BF
31 conv 192/ 96 3 x 3/ 1 28 x 28 x 96 -> 28 x 28 x 192 0.003 BF
32 conv 64 1 x 1/ 1 28 x 28 x 192 -> 28 x 28 x 64 0.019 BF
33 conv 192 1 x 1/ 1 28 x 28 x 64 -> 28 x 28 x 192 0.019 BF
34 conv 384/ 192 3 x 3/ 1 28 x 28 x 192 -> 28 x 28 x 384 0.005 BF
35 conv 64 1 x 1/ 1 28 x 28 x 384 -> 28 x 28 x 64 0.039 BF
36 Shortcut Layer: 32, wt = 0, wn = 0, outputs: 28 x 28 x 64 0.000 BF
37 conv 192 1 x 1/ 1 28 x 28 x 64 -> 28 x 28 x 192 0.019 BF
38 conv 384/ 192 3 x 3/ 1 28 x 28 x 192 -> 28 x 28 x 384 0.005 BF
39 conv 64 1 x 1/ 1 28 x 28 x 384 -> 28 x 28 x 64 0.039 BF
40 Shortcut Layer: 36, wt = 0, wn = 0, outputs: 28 x 28 x 64 0.000 BF
41 conv 192 1 x 1/ 1 28 x 28 x 64 -> 28 x 28 x 192 0.019 BF
42 conv 384/ 192 3 x 3/ 1 28 x 28 x 192 -> 28 x 28 x 384 0.005 BF
43 conv 64 1 x 1/ 1 28 x 28 x 384 -> 28 x 28 x 64 0.039 BF
44 Shortcut Layer: 40, wt = 0, wn = 0, outputs: 28 x 28 x 64 0.000 BF
45 route 44 18 -> 28 x 28 x 136
46 conv 192 1 x 1/ 1 28 x 28 x 136 -> 28 x 28 x 192 0.041 BF
47 conv 192/ 192 3 x 3/ 2 28 x 28 x 192 -> 14 x 14 x 192 0.001 BF
48 route 46 -> 28 x 28 x 192
49 conv 384/ 192 3 x 3/ 2 28 x 28 x 192 -> 14 x 14 x 384 0.001 BF
50 conv 96 1 x 1/ 1 14 x 14 x 384 -> 14 x 14 x 96 0.014 BF
51 conv 288 1 x 1/ 1 14 x 14 x 96 -> 14 x 14 x 288 0.011 BF
52 conv 576/ 288 3 x 3/ 1 14 x 14 x 288 -> 14 x 14 x 576 0.002 BF
53 conv 96 1 x 1/ 1 14 x 14 x 576 -> 14 x 14 x 96 0.022 BF
54 Shortcut Layer: 50, wt = 0, wn = 0, outputs: 14 x 14 x 96 0.000 BF
55 conv 288 1 x 1/ 1 14 x 14 x 96 -> 14 x 14 x 288 0.011 BF
56 conv 576/ 288 3 x 3/ 1 14 x 14 x 288 -> 14 x 14 x 576 0.002 BF
57 conv 96 1 x 1/ 1 14 x 14 x 576 -> 14 x 14 x 96 0.022 BF
58 Shortcut Layer: 54, wt = 0, wn = 0, outputs: 14 x 14 x 96 0.000 BF
59 route 58 47 -> 14 x 14 x 288
60 conv 288 1 x 1/ 1 14 x 14 x 288 -> 14 x 14 x 288 0.033 BF
61 conv 288/ 288 3 x 3/ 2 14 x 14 x 288 -> 7 x 7 x 288 0.000 BF
62 route 60 -> 14 x 14 x 288
63 conv 576/ 288 3 x 3/ 2 14 x 14 x 288 -> 7 x 7 x 576 0.001 BF
64 conv 160 1 x 1/ 1 7 x 7 x 576 -> 7 x 7 x 160 0.009 BF
65 conv 480 1 x 1/ 1 7 x 7 x 160 -> 7 x 7 x 480 0.008 BF
66 conv 960/ 480 3 x 3/ 1 7 x 7 x 480 -> 7 x 7 x 960 0.001 BF
67 conv 160 1 x 1/ 1 7 x 7 x 960 -> 7 x 7 x 160 0.015 BF
68 Shortcut Layer: 64, wt = 0, wn = 0, outputs: 7 x 7 x 160 0.000 BF
69 conv 480 1 x 1/ 1 7 x 7 x 160 -> 7 x 7 x 480 0.008 BF
70 conv 960/ 480 3 x 3/ 1 7 x 7 x 480 -> 7 x 7 x 960 0.001 BF
71 conv 160 1 x 1/ 1 7 x 7 x 960 -> 7 x 7 x 160 0.015 BF
72 Shortcut Layer: 68, wt = 0, wn = 0, outputs: 7 x 7 x 160 0.000 BF
73 conv 480 1 x 1/ 1 7 x 7 x 160 -> 7 x 7 x 480 0.008 BF
74 conv 960/ 480 3 x 3/ 1 7 x 7 x 480 -> 7 x 7 x 960 0.001 BF
75 conv 320 1 x 1/ 1 7 x 7 x 960 -> 7 x 7 x 320 0.030 BF
76 route 75 61 -> 7 x 7 x 608
77 conv 640 1 x 1/ 1 7 x 7 x 608 -> 7 x 7 x 640 0.038 BF
78 conv 27 1 x 1/ 1 7 x 7 x 640 -> 7 x 7 x 27 0.002 BF
79 yolo [yolo] params: iou loss: mse (2), iou_norm: 0.75, cls_norm: 1.00, scale_x_y: 1.00
80 route 75 -> 7 x 7 x 320
81 conv 128 1 x 1/ 1 7 x 7 x 320 -> 7 x 7 x 128 0.004 BF
82 upsample 2x 7 x 7 x 128 -> 14 x 14 x 128
83 route 82 60 -> 14 x 14 x 416
84 conv 256 3 x 3/ 1 14 x 14 x 416 -> 14 x 14 x 256 0.376 BF
85 conv 27 1 x 1/ 1 14 x 14 x 256 -> 14 x 14 x 27 0.003 BF
86 yolo

WongKinYiu commented 4 years ago

Hello, could you provide the cfg and training log? Do you want to use 224x224 as input? Did you modify your anchors to match your data?

You can find the parameters you should change before training in https://github.com/AlexeyAB/darknet#how-to-train-to-detect-your-custom-objects
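For reference, a minimal sketch of the edits described there (class count and paths are illustrative, not from this issue; the 27 filters in the printout above would correspond to (4 + 5) * 3, i.e. 4 classes):

# in each conv layer directly before a [yolo] layer:
# filters = (classes + 5) * 3
[convolutional]
size=1
stride=1
pad=1
filters=27
activation=linear

[yolo]
classes=4
# anchors can be recomputed for your data and input size, e.g.:
# ./darknet detector calc_anchors data/obj.data -num_of_clusters 6 -width 416 -height 416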

quocnhat commented 4 years ago

Hi @WongKinYiu, Thank you for your fast reply.

csmobilenet-v2.txt

WongKinYiu commented 4 years ago

Hello @quocnhat

You used the classifier's parameters in the [net] block; just change them to the detector's parameters:

[net]
# Testing
# batch=1
# subdivisions=1
# Training
batch=8
subdivisions=2
width=416
height=416
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

learning_rate=0.001
burn_in=1000
max_batches = 800000
policy=steps
steps=640000,720000
scales=.1,.1
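(Side note on this block: width/height are set to the 416x416 detector input size, and steps=640000,720000 are 80% and 90% of max_batches=800000, the ratio suggested in the AlexeyAB readme.)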

quocnhat commented 4 years ago

Great, I will try it and give feedback soon.

quocnhat commented 4 years ago

We can train without the pretrained weights (cspmobilenetv2_last.weights), but the detection results are worse than yolov3-tiny. Otherwise, with the pretrained weights, the avg loss becomes NaN.

WongKinYiu commented 4 years ago

Hello, could you provide the cfg files of both cspmobilenetv2 and yolov3-tiny? And if you train without pretrained weights, you need more burn_in and more training epochs (max_batches in the cfg).

By the way, if the avg loss becomes NaN, the results must be very bad.
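As a rough sketch of that advice (values are purely illustrative, not from this thread), training from scratch means raising the warm-up and schedule in the [net] block:

# illustrative values only: longer warm-up and schedule when no pretrained weights are used
burn_in=4000
max_batches=1000000
policy=steps
# steps kept at ~80% / ~90% of max_batches
steps=800000,900000
scales=.1,.1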

quocnhat commented 4 years ago

Sure, we cannot evaluate with a NaN avg loss. The result I compared above is at epoch 20 with no pretrained weights (avg loss ~ 5.0). I will keep training longer than usual because our model is trained from scratch (no pretraining). Here are my cfg files, thank you: csmobilenet-v2.txt yolov3-tiny.txt

WongKinYiu commented 4 years ago

epoch 20

Do you use Darknet or PyTorch?

quocnhat commented 4 years ago

I used ./darknet to train, following the AlexeyAB repo: https://github.com/AlexeyAB/darknet#how-to-train-to-detect-your-custom-objects

WongKinYiu commented 4 years ago

I modified the csmobilenet-v2.txt following your yolov3-tiny.txt.

If GPU RAM is not enough, increase the subdivisions.

And please provide your training command with the pre-trained model, thanks.

quocnhat commented 4 years ago

Thank you for your help, I will try it now. This training command gets NaN if we use the pre-trained model:

./darknet detector train csmobilenet-v2.data cfg/csmobilenet-v2.cfg weights/csmobilenet-v2_final.weights -clear 1

This training command is OK, but the detection result is not really good:

./darknet detector train csmobilenet-v2.data cfg/csmobilenet-v2.cfg

WongKinYiu commented 4 years ago

Yes, because you need to do partial first. https://github.com/AlexeyAB/darknet#how-to-train-tiny-yolo-to-detect-your-custom-objects

For csmobilenet-v2, you should use

darknet.exe partial csmobilenet-v2.cfg csmobilenet-v2_final.weights csmobilenet-v2.conv.77 77

to get your pre-trained model csmobilenet-v2.conv.77.
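So, reusing the file names from the commands earlier in this thread, the flow would be roughly:

# strip the classifier head, keeping the first 77 layers as a backbone
./darknet partial cfg/csmobilenet-v2.cfg weights/csmobilenet-v2_final.weights csmobilenet-v2.conv.77 77
# then train the detector starting from the stripped backbone weights
./darknet detector train csmobilenet-v2.data cfg/csmobilenet-v2.cfg csmobilenet-v2.conv.77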

quocnhat commented 4 years ago

Oh yes, I got it. Thank you so much for your time.

H-YunHui commented 4 years ago

@quocnhat When you used csmobilenetv2 (backbone) and yolov3_tiny (head) to detect objects, how effective was the detection? Have you tested it on the VOC or COCO datasets? How were the accuracy and speed?

quocnhat commented 4 years ago

@H-YunHui I tested it on my custom dataset for vehicle detection (BDD + COCO + VOC + MIO, ...), and the accuracy does not seem better than yolov3-tiny (in my case).

H-YunHui commented 4 years ago

@quocnhat Something may have gone a bit wrong in your training. I have used mobilenetV2+yolov3-tiny and got more than ten points higher than yolov3-tiny on my custom dataset for person detection (COCO + VOC). I haven't tried csmobilenetv2+yolov3_tiny; if I have time, I will try it.

quocnhat commented 4 years ago

Yes, please give it a try for a more accurate answer. I commented "seems not good" above because the training loss did not converge as I expected (it stayed at about ~4.0). Also, when I tested on images, the predicted bounding boxes did not fit the objects well, so I stopped training.

H-YunHui commented 4 years ago

@quocnhat What FPS can you reach when you test? On which device?

quocnhat commented 4 years ago

I care about both accuracy and speed, so when the result did not meet my expectations, I did not measure the FPS. My device is a GTX 1080 Ti.

quocnhat commented 4 years ago

@H-YunHui Given the result you reported, I want to test mobilenetv2+tinyv3 for vehicle detection. I have a question: how do I run partial on the pretrained weights after downloading them from the website, like in this step: https://github.com/WongKinYiu/CrossStagePartialNetworks/issues/14#issuecomment-591796365? Thank you.
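I guess it would be something like the following, with placeholder file names, and with the cut-off index replaced by the index of the last backbone layer in my cfg (81 here is only an illustration; the real index can be read from the darknet layer printout)?

# hypothetical sketch, mirroring the csmobilenet-v2 partial command above
./darknet partial cfg/mobilenetv2.cfg mobilenetv2_final.weights mobilenetv2.conv.81 81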

Damon0626 commented 4 years ago

I modified the csmobilenet-v2.txt following your yolov3-tiny.txt.

If GPU RAM is not enough, increase the subdivisions.

And please provide your training command with the pre-trained model, thanks.

Dear WongKinYiu, I compared the csmobilenet-v2.cfg you modified with quocnhat's. You modified some params that I could understand, but in some places you added conv layers and set different params, and I don't know why. You know, give a man a fish versus teach him to fish. So can you show the rules for modifying the cfg files, like the final filters before each yolo layer being (5+classes)x3, etc.? Thanks very much. (two screenshots attached)

quocnhat commented 4 years ago

Hello @Damon0626, as I understand it, the "route" mentioned above is the concatenation step (FPN method). See the snippet below: at line 70 of the layer printout, route 69 48 (in the cfg file: [route] layers = -1, 48). The current line/layer is 70, so layers = -1 means layer 69. The output size of layer 69 is 26x26, and we need to find the layer that has the same output size (layer 48) in order to concatenate properly. Hope this helps.
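A tiny illustrative cfg fragment of that part (indices as in the example above, presumably following an [upsample]; not a complete file):

[upsample]
stride=2

[route]
# -1 = the previous layer (here layer 69, the 26x26 upsample output)
# 48 = the backbone layer with the same 26x26 output, so the two can be concatenated
layers = -1, 48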

Damon0626 commented 4 years ago

@quocnhat Thank you very much. Now I understand the "route" layer's function. Before you answered, I used your cfg file and only changed the [net] params, the filters number, and the classes for my own dataset. Now the PC has been iterating: 4000 iterations out of 12000.

quocnhat commented 4 years ago

After testing on my own dataset, I find that combining mobilenet_v2 as the backbone and tiny-v3 as the detection head gives a better detection result. Below is the config file if you like: mobilenetv2_tinyv3.txt

LukeAI commented 4 years ago

@quocnhat interesting! Do you have any comparative results to share? What were the AP and FPS of mobilenetv2_tinyv3 vs. tiny-yolov3 on your dataset?

quocnhat commented 4 years ago

Sorry, I have not calculated the AP yet. From the loss curves, the total training loss of mobilenetv2 is much lower than that of csmobilenetv2 (about 2.2 compared to 4.x). Please test it yourself if you need a comparison.

fpshuang commented 4 years ago

I rewrote the cspmobilenet-yolo3 using TF2 and I get quite similar results: cspmobilenet's training loss is twice the loss of mobilenet-yolo, and the mAP is much worse too.

I wonder if you guys have similar results too.

quocnhat commented 4 years ago

I rewrote the cspmobilenet-yolo3 using TF2 and I get quite similar results: cspmobilenet's training loss is twice the loss of mobilenet-yolo, and the mAP is much worse too.

I wonder if you guys have similar results too.

Same training loss behavior as mine, but I did not check the mAP.