Open Zmjcc opened 4 years ago
You can refer to the results in closed issue #6 from @johnjunjun7. I'm trying to train the nano backbone on ImageNet, but it isn't finished yet because of other work.
From your previous discussion, I learned that it is better to pretrain the model on ImageNet and then train on VOC, but I want to use the VOC dataset to roughly test how well this model works. However, when I train your model and test it with eval.py, every class has an AP of 0. Have you encountered this? Or have you obtained a better pretrained model from ImageNet?
I trained directly on VOC2007, and the mAP is about 20, not zero.
I made an operation error previously and am now retraining. After training for 180 epochs, I converted the model and evaluated it with eval.py. The mAP was 44.63, but training has not converged yet. Has anyone trained on ImageNet to get a pretrained model?
I pretrained the nano backbone network on ImageNet (top-5 accuracy is currently 74%).
After loading the pretrained weights into the nano model, the mAP of trained_final.h5 is only about 32 (with the conf-threshold of eval.py set to 0.5). With conf-threshold = 0.3, the mAP is about 36.
Without the pretrained weights, training directly on VOC gives about 16, much worse than your result.
Maybe there's something wrong with my training parameters?
My parameters:
--model_type=yolov3-nano
--anchors_path=configs/yolo3_anchors.txt
--model_image_size=416X416
--weights_path=yolo_nano_preweight.h5
--annotation_file='tools/2007_train.txt'
--val_annotation_file='tools/2007_val.txt'
--classes_path='configs/voc_classes.txt'
--batch_size=32
--learning_rate=0.001
--cosine_decay_learning_rate=True
--init_epoch=20
--total_epoch=250
--multiscale=False
Other parameters are left at their defaults.
Is anything different from yours? What are your training and eval parameters?
For an ImageNet-pretrained YOLO Nano backbone, you may need to change the frozen layer number in the transfer training stage to the nanonet layer number, since I haven't set it yet.
Does the number of network layers include BN and ReLU layers? Based on the backbone network in your source code, what should the frozen layer number be?
For the current implementation, the backbone length should be 269. You can check it by printing out "len(model.layers)" in train_imagenet.py and excluding the tail layers.
I use the following code to process the weights trained on ImageNet:
from tensorflow.keras.models import Model, load_model
# Load the ImageNet-trained classifier and cut it at the last backbone layer.
base_model = load_model('/data/b14c757f950445d3ae628f07e2e36a2b/pkgs/pre_final_ep42.h5')
resnet_model = Model(inputs=base_model.input, outputs=base_model.get_layer('Conv_pw_3_relu').output)
resnet_model.summary()
# Save only the backbone weights so they can be loaded by layer name.
resnet_model.save_weights('my_model_weights.h5')
Then I load the weights with:
model_body.load_weights(weights_path, by_name=True)
Would this achieve the same effect?
Yes, that's the correct way to load the pretrained weights. For transfer learning, a common further practice is to freeze the well-pretrained part for some epochs so that the randomly initialized part trains first, and then unfreeze the whole network for fine-tuning. You can refer to the related comment here.
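The freeze-then-fine-tune schedule described above can be sketched as a small helper. This is only a minimal sketch, not code from the repository: `set_backbone_trainable` is a hypothetical name, and it assumes nothing more than a model object exposing an ordered `layers` list whose items have a `trainable` attribute, as `tf.keras` models do. The value 269 in the usage comments is the backbone length quoted in this thread.

```python
def set_backbone_trainable(model, num_backbone_layers, trainable):
    """Set `trainable` on the first `num_backbone_layers` layers of `model`.

    Hypothetical helper: assumes `model.layers` is an ordered list of layer
    objects with a `trainable` attribute (true for tf.keras models).
    Returns the number of layers that were touched.
    """
    backbone = model.layers[:num_backbone_layers]
    for layer in backbone:
        layer.trainable = trainable
    return len(backbone)


# Intended usage with a Keras model (not executed here):
#   set_backbone_trainable(model, 269, False)  # stage 1: train only the new head
#   model.compile(...); model.fit(...)
#   set_backbone_trainable(model, 269, True)   # stage 2: fine-tune everything
#   model.compile(...); model.fit(...)
```

In Keras, a change to `trainable` only takes effect once the model is compiled again, which is why each stage in the usage comments has its own `compile` call.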
Many thanks. I'm training now, and it looks like it's going to work a lot better.
I am trying to get a pretrained model on the COCO dataset, but when training with train.py, after training on about 1000 images I repeatedly get the following error:
  File "train.py", line 282, in <module>
    _main(args)
  File "train.py", line 188, in _main
    callbacks=callbacks)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/keras/engine/training.py", line 1433, in fit_generator
    steps_name='steps_per_epoch')
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/keras/engine/training_generator.py", line 220, in model_iteration
    batch_data = _get_next_batch(generator, mode)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/keras/engine/training_generator.py", line 362, in _get_next_batch
    generator_output = next(generator)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/keras/utils/data_utils.py", line 918, in get
    six.reraise(*sys.exc_info())
  File "/usr/lib/python3/dist-packages/six.py", line 686, in reraise
    raise value
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/keras/utils/data_utils.py", line 894, in get
    inputs = self.queue.get(block=True).get()
  File "/usr/lib/python3.5/multiprocessing/pool.py", line 608, in get
    raise self._value
  File "/usr/lib/python3.5/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/keras/utils/data_utils.py", line 828, in next_sample
    return six.next(_SHARED_SEQUENCES[uid])
  File "/home/undergraduate/folder1/Desktop/keras-YOLOv3-model-set-master/yolo3/data.py", line 276, in yolo3_data_generator
    image, box = get_random_data(annotation_lines[i], input_shape, random=True)
  File "/home/undergraduate/folder1/Desktop/keras-YOLOv3-model-set-master/yolo3/data.py", line 87, in get_random_data
    image = image.resize((nw,nh), Image.BICUBIC)
  File "/usr/local/lib/python3.5/dist-packages/PIL/Image.py", line 1763, in resize
    self.load()
  File "/usr/local/lib/python3.5/dist-packages/PIL/ImageFile.py", line 232, in load
    "(%d bytes not processed)" % len(b))
OSError: image file is truncated (7 bytes not processed)
From the error log, it seems the image is corrupted. You can print out the image file name to check the file content, or display it in code with "image.show()".
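One way to find the files behind an "image file is truncated" error before training starts is to scan the annotation file up front. The sketch below is a heuristic using only the standard library, not part of the repository: `looks_truncated_jpeg` and `find_suspect_images` are hypothetical names, and it assumes that every image is a JPEG and that each annotation line begins with the image path followed by space-separated boxes (so paths containing spaces are not handled).

```python
import os


def looks_truncated_jpeg(path):
    """Return True if `path` lacks the JPEG start (FFD8) or end (FFD9) marker.

    Heuristic only: a few valid files carry trailing bytes after the end
    marker, and non-JPEG images will always be flagged.
    """
    if os.path.getsize(path) < 4:
        return True
    with open(path, "rb") as f:
        head = f.read(2)
        f.seek(-2, os.SEEK_END)
        tail = f.read(2)
    return head != b"\xff\xd8" or tail != b"\xff\xd9"


def find_suspect_images(annotation_file):
    """Yield image paths that are missing or look truncated.

    Assumes each non-empty line starts with the image path, followed by
    space-separated box fields (an assumption about the annotation format).
    """
    with open(annotation_file) as f:
        for line in f:
            fields = line.split()
            if not fields:
                continue
            path = fields[0]
            if not os.path.isfile(path) or looks_truncated_jpeg(path):
                yield path
```

Running `list(find_suspect_images('tools/2007_train.txt'))` would list candidates to re-download or drop from the annotation file. Pillow also offers `ImageFile.LOAD_TRUNCATED_IMAGES = True`, which loads such files anyway, but that hides the damaged data rather than removing it.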
How can I calculate the FPS of the model in Keras? I'd like to see the model's inference speed. :)
You can use validate_yolo.py. It runs inference several times and shows the average time cost.
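For a rough number outside that script, the same idea (repeat inference, average the time) can be sketched generically. This is not how validate_yolo.py is implemented; `measure_fps` is a hypothetical helper, and `predict` stands for any callable, for example `model.predict` on a Keras model.

```python
import time


def measure_fps(predict, sample, runs=100, warmup=10):
    """Return (average seconds per call, calls per second) for `predict(sample)`.

    `predict` is any callable. Warm-up calls are excluded from the timing
    so that one-off setup cost does not skew the average.
    """
    for _ in range(warmup):
        predict(sample)
    start = time.perf_counter()
    for _ in range(runs):
        predict(sample)
    elapsed = time.perf_counter() - start
    avg = elapsed / runs
    fps = 1.0 / avg if avg > 0 else float("inf")
    return avg, fps
```

With a batch of one image as `sample`, the second return value is frames per second for inference alone; image loading, pre-processing and NMS are not included unless `predict` wraps them.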
There is now a new way of computing IoU that can improve the convergence speed and accuracy of YOLOv3. It only requires changing the IoU formula used in the loss function. You could give it a try.
Here are the reference links: https://cloud.tencent.com/developer/article/1558533 https://arxiv.org/pdf/1911.08287.pdf
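The linked paper defines Distance-IoU as the plain IoU minus a penalty: the squared distance between the two box centres divided by the squared diagonal of the smallest box enclosing both, with the loss being one minus that value. A minimal pure-Python sketch for a single pair of boxes follows; the function names and the (x_min, y_min, x_max, y_max) box layout are assumptions for illustration, and a training loss would need the same arithmetic on batched tensors.

```python
def diou(box_a, box_b):
    """Distance-IoU of two boxes given as (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Plain IoU: intersection area over union area.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distance between the two box centres.
    center_dist2 = (((ax1 + ax2) - (bx1 + bx2)) / 2) ** 2 + (
        ((ay1 + ay2) - (by1 + by2)) / 2
    ) ** 2

    # Squared diagonal of the smallest box enclosing both boxes.
    diag2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 + (
        max(ay2, by2) - min(ay1, by1)
    ) ** 2

    penalty = center_dist2 / diag2 if diag2 > 0 else 0.0
    return iou - penalty


def diou_loss(box_pred, box_true):
    """DIoU loss: 0 for a perfect match, above 1 for distant disjoint boxes."""
    return 1.0 - diou(box_pred, box_true)
```

Unlike a plain IoU loss, the penalty term stays informative when the boxes do not overlap at all, because the centre distance still changes as the prediction moves toward the target.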
Many thanks. I'm now working on other tasks and will try to pick it up later.
Hi @johnjunjun7, I've just implemented a draft of the DIoU loss and DIoU NMS (with numpy) for the YOLOv3 model set and tried DIoU NMS on the existing pretrained weights. It seems DIoU NMS does slightly improve the mAP for all models. The related code has been merged, and I'll move on to the DIoU loss for training. Thanks again for the useful info.
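For readers unfamiliar with DIoU NMS, the idea is ordinary greedy NMS with the overlap test swapped from IoU to DIoU, so a lower-scoring box is suppressed only when it both overlaps a kept box and has a nearby centre. The sketch below is a pure-Python illustration under that reading of the paper, not the merged numpy implementation; `diou_nms`, `_diou`, the (x_min, y_min, x_max, y_max) box layout and the default threshold of 0.45 are all assumptions.

```python
def _diou(a, b):
    """DIoU of two (x_min, y_min, x_max, y_max) boxes: IoU minus centre penalty."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    iou = inter / union if union > 0 else 0.0
    # Squared centre distance and squared diagonal of the enclosing box.
    d2 = ((a[0] + a[2] - b[0] - b[2]) / 2) ** 2 + ((a[1] + a[3] - b[1] - b[3]) / 2) ** 2
    c2 = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 + (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    return iou - (d2 / c2 if c2 > 0 else 0.0)


def diou_nms(boxes, scores, threshold=0.45):
    """Return indices of the kept boxes, highest score first.

    A box is dropped when its DIoU with any already kept (higher-scoring)
    box is greater than or equal to `threshold`.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(_diou(boxes[i], boxes[k]) < threshold for k in keep):
            keep.append(i)
    return keep
```

For multi-class detection this would typically be applied per class, and the quadratic loop is fine for a sketch but would usually be vectorised in practice.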