Training with a synthetic dataset

euivmar commented 5 years ago

Dear @AlexeyAB

We are training with synthetic data extracted from a rendered photorealistic 3D model (a 3D motor engine) with random backgrounds (VOC images) and from multiple scales and viewpoints, replicating the work of Hinterstoisser et al. 2017. The procedure is summarized in this image from the paper: render.

We use random lighting and several image augmentations such as Gaussian noise, flips, and rotations. We are working with about 20,000 images for a single class. Do you have any advice for training the YOLO network? Is it necessary to freeze YOLO layers? How can we do it? Currently, we are testing with: darknet.exe detector train data/obj.data yolo-obj.cfg yolov3.conv.81
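Since the object is rendered, its position in every image is known, so the label files can be generated automatically. Darknet expects one text line per object: the class index followed by the box center and size, all normalized to the image dimensions. Below is a minimal sketch of that conversion; the function name `bbox_to_yolo_label` and the example numbers are hypothetical and not taken from this thread.

```python
def bbox_to_yolo_label(x_min, y_min, x_max, y_max, img_w, img_h, class_id=0):
    """Convert a pixel bounding box to one Darknet/YOLO label line.

    The box is given as pixel corners of the rendered object inside an
    image of size img_w x img_h. The result is
    "<class> <x_center> <y_center> <width> <height>", with the four
    numbers expressed as fractions of the image size.
    """
    if not (0 <= x_min < x_max <= img_w and 0 <= y_min < y_max <= img_h):
        raise ValueError("box must be non-empty and lie inside the image")

    x_center = (x_min + x_max) / 2.0 / img_w
    y_center = (y_min + y_max) / 2.0 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical example: a 40x20 px object at (20, 10) in a 200x100 image.
label_line = bbox_to_yolo_label(20, 10, 60, 30, img_w=200, img_h=100)
```

Each image then gets a `.txt` file with the same base name containing one such line per object.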

AlexeyAB commented 5 years ago

@euivmar Hi,

> We are training with synthetic data extracted from a rendered photorealistic 3D model

This is a very good approach.

> Do you have any advice for training the YOLO network? Is it necessary to freeze YOLO layers? How can we do it?

No. To get the highest accuracy, you shouldn't freeze any layers.

> Currently, we are testing with: darknet.exe detector train data/obj.data yolo-obj.cfg yolov3.conv.81

This is correct. If you want to achieve a higher mAP@0.5, base your cfg file on yolov3-spp.cfg instead of yolov3.cfg.

If you want to achieve a higher mAP@0.75 or mAP@0.5...0.95, then base your cfg file on `yolov3-spp.cfg` and add these 2 lines to each of the 3 `[yolo]` layers:

iou_normalizer=0.5
iou_loss=giou
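Adding the two lines by hand to three sections is easy to get wrong, so here is a minimal sketch that patches a cfg file's text. It assumes the usual Darknet cfg layout (section headers such as `[yolo]` on their own line, `key=value` lines below them); the function name `add_giou_to_yolo_layers` is hypothetical. Keys that already exist in a `[yolo]` section are left untouched rather than overwritten.

```python
def add_giou_to_yolo_layers(cfg_text, settings=None):
    """Return cfg_text with the given keys added to every [yolo] section."""
    if settings is None:
        settings = {"iou_normalizer": "0.5", "iou_loss": "giou"}

    lines = cfg_text.splitlines()
    out = []
    i = 0
    while i < len(lines):
        line = lines[i]
        out.append(line)
        i += 1
        if line.strip() != "[yolo]":
            continue

        # Collect the body of this [yolo] section, up to the next header.
        body = []
        while i < len(lines) and not lines[i].strip().startswith("["):
            body.append(lines[i])
            i += 1

        # Keys already set in this section (comment lines are ignored).
        present = {
            entry.split("=", 1)[0].strip()
            for entry in body
            if "=" in entry and not entry.lstrip().startswith("#")
        }

        # Insert missing keys right after the header, then the old body.
        for key, value in settings.items():
            if key not in present:
                out.append(f"{key}={value}")
        out.extend(body)

    return "\n".join(out) + "\n"
```

Typical use would be to read the cfg file, pass its contents through the function, and write the result to a new file, so the original cfg stays available for comparison.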

Also disable CUDNN_HALF.

in CMake-GUI: ENABLED -> CUDNN_HALF (un-set checkbox)
in MSVS: open \darknet.sln -> (right click on project) -> properties -> C/C++ -> Preprocessor -> Preprocessor Definitions, and remove this: CUDNN_HALF;

euivmar commented 5 years ago

Thanks so much for your quick answer @AlexeyAB! I am going to try your suggestions.

yanxurui commented 5 years ago

@AlexeyAB

> If you want to achieve a higher mAP@0.75 or mAP@0.5...0.95, then base your cfg file on `yolov3-spp.cfg` and add these 2 lines to each of the 3 `[yolo]` layers:
> iou_normalizer=0.5
> iou_loss=giou

Does a lower iou_normalizer favor a higher mAP@0.75? I would have expected the opposite, since a higher iou_normalizer makes the loss pay more attention to the bounding box.

Thanks.
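For readers following the question above, this is a minimal sketch of the quantities involved: IoU, GIoU (IoU minus the fraction of the smallest enclosing box not covered by the union), and a box loss scaled by a weight. Treating `iou_normalizer` as a plain multiplier on `1 - GIoU` is a simplifying assumption made for illustration; it is not confirmed in this thread, and Darknet's actual implementation may differ. The function names are hypothetical.

```python
def iou_and_giou(box_a, box_b):
    """Return (IoU, GIoU) for two boxes given as (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection area (zero if the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    if union <= 0:
        raise ValueError("boxes must have positive area")

    # Smallest axis-aligned box that encloses both inputs.
    enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))

    iou = inter / union
    giou = iou - (enclose - union) / enclose
    return iou, giou


def weighted_box_loss(box_pred, box_true, iou_normalizer=0.5):
    """Assumed form: the GIoU loss (1 - GIoU) scaled by iou_normalizer."""
    _, giou = iou_and_giou(box_pred, box_true)
    return iou_normalizer * (1.0 - giou)
```

Under this assumed form, `iou_normalizer` only changes how strongly the box term counts relative to the objectness and class terms; it does not change which box minimizes the box term itself.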

AlexeyAB / darknet

Training with a synthetic dataset #3443