GOATmessi8 / RFBNet

Receptive Field Block Net for Accurate and Fast Object Detection, ECCV 2018
MIT License

Training datasets with 2 classes (with style of pascal voc) but got an AP=0 #53

Open BestSongEver opened 6 years ago

BestSongEver commented 6 years ago

Thanks for your excellent work, @ruinmessi! I need a little help, please. @ruinmessi @yuyijie1995 I am trying to train RFBNet on a dataset with only 2 classes, in Pascal VOC style. I changed ~/data/voc0712.py like this:

    VOC_CLASSES = ('background',  # always index 0
                   'person')
    ### 'aeroplane', 'bicycle', 'bird', 'boat',
    ### 'bottle', 'bus', 'car', 'cat', 'chair',
    ### 'cow', 'diningtable', 'dog', 'horse',
    ### 'motorbike', 'person', 'pottedplant',
    ### 'sheep', 'sofa', 'train', 'tvmonitor')

and changed train_RFB.py like this:

    num_classes = (2, 2)[args.dataset == 'COCO']
    ### num_classes = (21, 81)[args.dataset == 'COCO']
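Editor's note: one way to keep the class count from drifting between files is to derive it from the class tuple instead of hard-coding it. The following is only a minimal sketch, assuming the tuple already lists 'background' at index 0 as in the snippet above; `count_classes` is a hypothetical helper, not part of RFBNet.

```python
# Hypothetical helper (not part of RFBNet): derive the class count
# from the class tuple so the two values cannot disagree.
VOC_CLASSES = ('background',  # always index 0
               'person')


def count_classes(classes):
    """Return the number of classes, checking that background is index 0."""
    if not classes or classes[0] != 'background':
        raise ValueError("expected 'background' at index 0")
    if len(set(classes)) != len(classes):
        raise ValueError("duplicate class names")
    return len(classes)


num_classes = count_classes(VOC_CLASSES)
print(num_classes)  # 2
```

With this approach, the same value could be passed to the model builder, the loss and the test script, so a change to the tuple is picked up everywhere.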

I ran train_RFB.py, but got AP=0.0000 after 120 epochs.

I looked at the loss during training, and it did not go well. (screenshot) I once trained my dataset on Faster R-CNN and got a normal AP of 49%. This time I set the right path to my data. (Last time I set the wrong path, so I could not run train_RFB.py successfully, as @Excalibur0214 said.) So maybe there is nothing wrong with my dataset. Should I change more scripts? Could you please tell me how I can fix it? Thanks again!

left4back commented 6 years ago

Are you sure you got the dataset right? The error message told you that your dataset annotations may contain the class 'sofa'.

BestSongEver commented 6 years ago

@Excalibur0214 Could you please help me with my issue? I am new to this and I want to train the program on my dataset. Thanks!

left4back commented 6 years ago

I ran into your earlier problem before, but I have never seen your current one. Maybe you should carefully examine your output boxes and draw them on your test pictures. It's worth a try.

GOATmessi8 commented 6 years ago

@BestSongEver Your location loss is abnormal, so I suggest you check the annotations of your own data. For training, the box format is [xmin, ymin, xmax, ymax].
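Editor's note: a quick way to act on this advice is to scan the annotation files for boxes that break the [xmin, ymin, xmax, ymax] convention. The sketch below assumes standard Pascal VOC XML (a `size` element plus `object/bndbox` entries); `find_bad_boxes` and the sample annotation are hypothetical and only illustrate the check.

```python
import xml.etree.ElementTree as ET


def find_bad_boxes(xml_text, class_names):
    """Return (name, box, reason) for each suspicious object in one VOC XML."""
    root = ET.fromstring(xml_text)
    size = root.find('size')
    width = int(size.find('width').text)
    height = int(size.find('height').text)
    problems = []
    for obj in root.iter('object'):
        name = obj.find('name').text.strip()
        bb = obj.find('bndbox')
        box = [int(float(bb.find(tag).text))
               for tag in ('xmin', 'ymin', 'xmax', 'ymax')]
        xmin, ymin, xmax, ymax = box
        if name.lower() not in class_names:
            problems.append((name, box, 'unknown class'))
        if xmin >= xmax or ymin >= ymax:
            problems.append((name, box, 'empty or inverted box'))
        elif xmin < 0 or ymin < 0 or xmax > width or ymax > height:
            problems.append((name, box, 'box outside image'))
    return problems


# Made-up annotation: the second box has xmin > xmax.
SAMPLE = """<annotation>
  <size><width>500</width><height>375</height><depth>3</depth></size>
  <object><name>person</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
  <object><name>person</name>
    <bndbox><xmin>300</xmin><ymin>100</ymin><xmax>120</xmax><ymax>200</ymax></bndbox>
  </object>
</annotation>"""

bad = find_bad_boxes(SAMPLE, {'person'})
for name, box, reason in bad:
    print(name, box, reason)
```

Running the same function over every XML file in the Annotations folder would list the files worth opening by hand.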

BestSongEver commented 6 years ago

@ruinmessi Thanks for your reply. I made my dataset strictly follow Pascal VOC2007, and it can be trained successfully by another program. Still, I tried another Pascal VOC-like dataset with only one class (bird), but the code still didn't work.

I updated NUM_CLASSES and num_classes in data/voc0712.py, models/*.py, train_RFB.py and test_RFB.py.

I just found that the location loss and class loss went well at the beginning. But after a few epochs (maybe 4 or 7), the class loss suddenly and abnormally dropped from around 2 to nearly 0.0. After that, the location loss stayed above 1.0000 and never dropped again, while the class loss stayed below 0.1.

For my person dataset:

Epoch:6 || epochiter: 735/769|| Totel iter 4580 || L: 1.9189 C: 2.5317||Batch time: 1.3623 sec. ||LR: 0.00400000
Epoch:6 || epochiter: 745/769|| Totel iter 4590 || L: 1.7877 C: 2.2604||Batch time: 1.3614 sec. ||LR: 0.00400000
Epoch:6 || epochiter: 755/769|| Totel iter 4600 || L: 1.5482 C: 1.7651||Batch time: 1.3544 sec. ||LR: 0.00400000
Epoch:6 || epochiter: 765/769|| Totel iter 4610 || L: 1.3797 C: 2.2506||Batch time: 1.3631 sec. ||LR: 0.00400000
Epoch:7 || epochiter: 6/769|| Totel iter 4620 || L: 1.0155 C: 0.9434||Batch time: 1.3564 sec. ||LR: 0.00400000
Epoch:7 || epochiter: 16/769|| Totel iter 4630 || L: 1.5007 C: 0.2787||Batch time: 1.3590 sec. ||LR: 0.00400000
Epoch:7 || epochiter: 26/769|| Totel iter 4640 || L: 1.5300 C: 0.2775||Batch time: 1.3682 sec. ||LR: 0.00400000
Epoch:7 || epochiter: 36/769|| Totel iter 4650 || L: 1.8310 C: 0.5066||Batch time: 1.3597 sec. ||LR: 0.00400000
Epoch:7 || epochiter: 46/769|| Totel iter 4660 || L: 1.5827 C: 0.1721||Batch time: 1.3624 sec. ||LR: 0.00400000
Epoch:7 || epochiter: 56/769|| Totel iter 4670 || L: 2.2631 C: 0.0913||Batch time: 1.3968 sec. ||LR: 0.00400000
Epoch:7 || epochiter: 66/769|| Totel iter 4680 || L: 1.2228 C: 0.0428||Batch time: 1.3537 sec. ||LR: 0.00400000
Epoch:7 || epochiter: 76/769|| Totel iter 4690 || L: 1.5303 C: 0.0263||Batch time: 1.3642 sec. ||LR: 0.00400000
Epoch:7 || epochiter: 86/769|| Totel iter 4700 || L: 2.2687 C: 0.0883||Batch time: 1.3423 sec. ||LR: 0.00400000
Epoch:7 || epochiter: 96/769|| Totel iter 4710 || L: 1.1470 C: 0.0194||Batch time: 1.3320 sec. ||LR: 0.00400000
Epoch:7 || epochiter: 106/769|| Totel iter 4720 || L: 1.8459 C: 0.0782||Batch time: 1.3386 sec. ||LR: 0.00400000

For a VOC-like bird dataset:

Epoch:3 || epochiter: 12/44|| Totel iter 100 || L: 3.3108 C: 3.3798||Batch time: 1.2181 sec. ||LR: 0.00181873
Epoch:3 || epochiter: 22/44|| Totel iter 110 || L: 3.2515 C: 3.1052||Batch time: 1.2168 sec. ||LR: 0.00200050
Epoch:3 || epochiter: 32/44|| Totel iter 120 || L: 2.8687 C: 3.1402||Batch time: 3.3242 sec. ||LR: 0.00218227
Epoch:3 || epochiter: 42/44|| Totel iter 130 || L: 2.6957 C: 3.2901||Batch time: 1.2092 sec. ||LR: 0.00236405
Epoch:4 || epochiter: 8/44|| Totel iter 140 || L: 2.9657 C: 2.9876||Batch time: 30.9130 sec. ||LR: 0.00254582
Epoch:4 || epochiter: 18/44|| Totel iter 150 || L: 3.1143 C: 0.8904||Batch time: 1.2268 sec. ||LR: 0.00272759
Epoch:4 || epochiter: 28/44|| Totel iter 160 || L: 2.9572 C: 0.1551||Batch time: 2.3035 sec. ||LR: 0.00290936
Epoch:4 || epochiter: 38/44|| Totel iter 170 || L: 3.0392 C: 0.1138||Batch time: 1.2104 sec. ||LR: 0.00309114
Epoch:5 || epochiter: 4/44|| Totel iter 180 || L: 2.9200 C: 45.1219||Batch time: 1.2046 sec. ||LR: 0.00327291
Epoch:5 || epochiter: 14/44|| Totel iter 190 || L: 5.1481 C: 2.3512||Batch time: 1.2625 sec. ||LR: 0.00345468
Epoch:5 || epochiter: 24/44|| Totel iter 200 || L: 3.3059 C: 0.1716||Batch time: 27.4927 sec. ||LR: 0.00363645
Epoch:5 || epochiter: 34/44|| Totel iter 210 || L: 3.5403 C: 0.0787||Batch time: 27.3811 sec. ||LR: 0.00381823
Epoch:6 || epochiter: 0/44|| Totel iter 220 || L: 3.3519 C: 0.0618||Batch time: 58.6951 sec. ||LR: 0.00400000
Epoch:6 || epochiter: 10/44|| Totel iter 230 || L: 3.8552 C: 0.0889||Batch time: 1.2170 sec. ||LR: 0.00400000

I am still confused. Please help me if you can. Thanks again.
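Editor's note: to pin down where the class loss collapses in a long run, the log lines can be parsed instead of read by eye. This is a rough sketch that assumes the exact line layout shown in the logs above (including the "Totel iter" spelling); `parse_losses`, `first_collapse` and the 0.5 threshold are hypothetical choices for illustration.

```python
import re

# Matches the epoch, position in epoch, total iteration, and both losses.
LOG_PATTERN = re.compile(
    r'Epoch:(\d+) \|\| epochiter: (\d+)/(\d+)\|\| Totel iter (\d+) '
    r'\|\| L: ([\d.]+) C: ([\d.]+)')


def parse_losses(log_text):
    """Return (total_iter, loc_loss, conf_loss) tuples found in the log text."""
    return [(int(m.group(4)), float(m.group(5)), float(m.group(6)))
            for m in LOG_PATTERN.finditer(log_text)]


def first_collapse(records, threshold=0.5):
    """Return the first iteration whose class loss is below threshold, or None."""
    for total_iter, _loc, conf in records:
        if conf < threshold:
            return total_iter
    return None


# Three lines copied from the person run above.
LOG = """\
Epoch:6 || epochiter: 765/769|| Totel iter 4610 || L: 1.3797 C: 2.2506||Batch time: 1.3631 sec. ||LR: 0.00400000
Epoch:7 || epochiter: 6/769|| Totel iter 4620 || L: 1.0155 C: 0.9434||Batch time: 1.3564 sec. ||LR: 0.00400000
Epoch:7 || epochiter: 16/769|| Totel iter 4630 || L: 1.5007 C: 0.2787||Batch time: 1.3590 sec. ||LR: 0.00400000
"""

records = parse_losses(LOG)
print(first_collapse(records))  # 4630
```

Knowing the iteration makes it easier to inspect the batches and learning rate around that point.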

GOATmessi8 commented 6 years ago

@BestSongEver You may try a smaller lr, like 0.001 or 0.0005, in your case.
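Editor's note: the LR column in the bird log earlier in this thread is consistent with a linear warm-up from 1e-6 to the base rate of 0.004 over the first five epochs, after which the rate stays flat. That schedule is inferred from the logged numbers only, not read from train_RFB.py, so treat the sketch below as an assumption; `warmup_lr` is a hypothetical function.

```python
def warmup_lr(base_lr, iteration, epoch_size, warmup_epochs=5, start_lr=1e-6):
    """Linear ramp from start_lr to base_lr over warmup_epochs, then constant."""
    warmup_iters = epoch_size * warmup_epochs
    if iteration >= warmup_iters:
        return base_lr
    return start_lr + (base_lr - start_lr) * iteration / warmup_iters


# Compare with two logged values from the bird run (44 iterations per epoch).
print(f"{warmup_lr(0.004, 100, 44):.8f}")  # log shows 0.00181873 at iter 100
print(f"{warmup_lr(0.004, 220, 44):.8f}")  # log shows 0.00400000 at iter 220

# The same point in the schedule with the suggested smaller base rate.
print(f"{warmup_lr(0.001, 100, 44):.8f}")
```

Under this assumed schedule, lowering the base rate scales every step of the ramp, so the class-loss collapse seen around the end of warm-up would be reached with a proportionally smaller step size.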

nvlong21 commented 6 years ago

I also trained with 2 classes at first, the same as you. Then I added a background entry [0,0,0,0,0], and the result was really good. Instead of np.empty((0,5)), I changed it to np.zeros((1,5)). Alternatively, you can add some images that contain no objects of your class, with the label [0,0,0,0,0].

nvlong21 commented 6 years ago

loc_loss went from 3.x to 0.11x and conf_loss from 9.x to 0.5x, with batch_size 32 and lr 0.0023.

nvlong21 commented 6 years ago

For the other images (those without objects), I changed the code like this:

    anno_id = self.ids[index]
    if anno_id in self.ids_none_object:
        res = np.empty((0,5))
        bndbox = [0,0,0,0,0]
        target = np.vstack((res,bndbox))
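Editor's note: the difference between the two array constructors is only in the number of rows, which the following sketch makes visible. `make_target` is a hypothetical stand-in for the annotation transform, not the code from the repository, and how the loss treats an all-zero row (label 0) depends on the label convention in use, which is not verified here.

```python
import numpy as np


def make_target(boxes):
    """Stack [xmin, ymin, xmax, ymax, label] rows; use one zero row if empty."""
    res = np.empty((0, 5))
    for box in boxes:
        res = np.vstack((res, box))
    if res.shape[0] == 0:
        # No annotated object: emit a single all-zero row instead of 0 rows.
        res = np.zeros((1, 5))
    return res


print(np.empty((0, 5)).shape)                        # (0, 5)
print(make_target([]).shape)                         # (1, 5)
print(make_target([[48, 240, 195, 371, 1]]).shape)   # (1, 5)
```

An array with zero rows gives downstream code nothing to match against, while the single zero row keeps the target shape the same as for an image with one object.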

nvlong21 commented 6 years ago

After 200 epochs, with 640 images per epoch: (image)

BestSongEver commented 6 years ago

Sounds good! @vanlong96tg Could you please send me your revised scripts? E-mail: 475527783@qq.com. I will give it a shot with my dataset. If it works well with my data, I will recommend your answer as the final solution. Thank you again.

nvlong21 commented 6 years ago

You can see here: https://github.com/vanlong96tg/License_plate_detection

nvlong21 commented 6 years ago

I only change the class VOCDetection and AnnotationTransform. You can see the json file in the data that I provided.

BestSongEver commented 6 years ago

Thanks for your reply, @vanlong96tg. Since my dataset is organized in Pascal VOC style, I'd like to try your scripts later, if I can successfully convert my dataset to your format.

BestSongEver commented 6 years ago

I noticed that your dataset is small, with about 600 annotated images. I tried the original scripts on a similarly small dataset, and they also worked well. @vanlong96tg But the original scripts work badly with my big dataset (about 28,000 annotated images). With a lower lr, it can train now, but the results are still bad so far. Do you believe your good result was caused by your small dataset, or by your code changes? Thanks.

nvlong21 commented 6 years ago

Maybe! But it's worth trying.

nvlong21 commented 6 years ago

As can be seen, one class in VOC_data is not really large.

lucasjinreal commented 6 years ago

@vanlong96tg Have you trained RFBNet with MobileNet? I trained for about 2 days, 30 epochs, and got nothing: the highest probabilities are all background, and it cannot detect any boxes at all.

dodogoffy commented 5 years ago

@BestSongEver I have the same problem as you. How did you solve it?

LLuqw commented 5 years ago

@dodogoffy I also ran into this problem. I modified voc0712.py for my own dataset. The AnnotationTransform class applies lower() to the ['name'].text, but voc_eval.py doesn't. For example, you define OWN_CLASSES = ('background', 'class'), but in the XML file the name value is Class. Although they differ, training succeeds but testing fails. So you can try: R = [obj for obj in recs[imagename] if obj['name'].lower() == classname]
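Editor's note: the effect of the case mismatch can be reproduced with a few lines. The sketch below uses a made-up `recs` dictionary in the shape the comment describes (image id mapped to a list of objects with a 'name' key); `select_objects` is a hypothetical helper that wraps the suggested list comprehension.

```python
def select_objects(objects, classname):
    """Keep annotation dicts whose 'name' matches classname, ignoring case."""
    wanted = classname.strip().lower()
    return [obj for obj in objects if obj['name'].strip().lower() == wanted]


# Made-up record: the XML stores 'Class', the class tuple stores 'class'.
recs = {'000001': [{'name': 'Class', 'bbox': [48, 240, 195, 371]}]}

strict = [obj for obj in recs['000001'] if obj['name'] == 'class']
relaxed = select_objects(recs['000001'], 'class')
print(len(strict), len(relaxed))  # 0 1
```

With the strict comparison, the evaluation finds no ground-truth objects for the class, which would explain an AP of 0 even when training looked fine.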

dodogoffy commented 5 years ago

Thanks a lot! It works now!