AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet)
http://pjreddie.com/darknet/

How to use customized loss function in yolo? #923

Open · cloudnine148 opened this issue 6 years ago

cloudnine148 commented 6 years ago

I want to classify the orientation of the pedestrian.

So, how can I change the existing loss function to a different formula?

AlexeyAB commented 6 years ago

What loss function exactly do you want? Do you want to train a classifier, or a detector that detects different orientations as different objects?

cloudnine148 commented 6 years ago

I want to detect pedestrians using YOLO and then classify the orientation of each pedestrian. For the orientation classification, I will use a small, customized classification network.

Therefore, a separate loss function for orientation classification is required and must be trained. The loss function calculates the angle from the 2D coordinates of the region center.

What I'm curious about is the following:

  1. I want to modify the loss function for orientation classification in the code. Is there anything I need to consider when editing the source?

  2. Is the two-stage approach of loading the detection and classification networks at the same time valid? I do not run detection and classification simultaneously; I will classify each pedestrian after it is detected.

AlexeyAB commented 6 years ago

  1. Classification networks usually use [softmax] and [cost] layers at the end.

  2. I think it should work. Darknet can work with several networks on different GPUs simultaneously in one process, so there shouldn't be any problems. A minimal sketch of this setup follows.
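A rough sketch of that two-network setup, assuming the C API declared in darknet.h. The function names below exist in the darknet headers, but exact signatures differ between forks, and the cfg/weights file names here are placeholders, not files from this thread:

```c
#include "darknet.h"

int main(void)
{
    /* Load the pedestrian detector on GPU 0.
       The cfg/weights paths are placeholders for illustration. */
    gpu_index = 0;
    cuda_set_device(0);
    network *det = load_network("yolov3.cfg", "yolov3.weights", 0);
    set_batch_network(det, 1);

    /* Load the orientation classifier on GPU 1, in the same process. */
    gpu_index = 1;
    cuda_set_device(1);
    network *cls = load_network("vgg-16.cfg", "vgg-16.weights", 0);
    set_batch_network(cls, 1);

    /* Two-stage inference: run the detector, crop each detected
       pedestrian, resize the crop to the classifier's input size,
       then run the classifier on it to get orientation probabilities. */
    return 0;
}
```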

cloudnine148 commented 6 years ago

Hi, your help is really appreciated. I want to classify pedestrian orientation.

So I used the TUD Multiview Pedestrians dataset with the VGG16 network. This dataset has 8 orientation classes, so I changed the output of the last connected layer in VGG16.cfg from 1000 to 8, as sketched below.
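The tail of the modified cfg might look like the following. This is a sketch based on darknet's standard classification cfg layout, ending with the [softmax] and [cost] layers suggested above; the surrounding layers depend on the exact VGG16 cfg used:

```ini
[connected]
# was output=1000 for the 1000 ImageNet classes
output=8
activation=linear

[softmax]
groups=1

[cost]
type=sse
```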

I also used cross entropy as the classification loss function, e.g. `error[i] = truth[i] * (-log(pred[i]));`
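For reference, a self-contained sketch of that cross-entropy term together with its gradient, the delta that the loss layer backpropagates; darknet itself ships a similar helper, softmax_x_ent_cpu in src/blas.c:

```c
#include <math.h>

/* Cross-entropy over one softmax output vector of length n.
   truth is one-hot, pred is the softmax output.
   error[i] holds the per-class loss term truth[i] * -log(pred[i]);
   delta[i] holds the gradient (truth - pred) of the combined
   softmax + cross-entropy with respect to the pre-softmax logits. */
void cross_entropy(int n, const float *pred, const float *truth,
                   float *delta, float *error)
{
    for (int i = 0; i < n; ++i) {
        float t = truth[i];
        float p = pred[i];
        error[i] = (t > 0) ? t * (-logf(p)) : 0;
        delta[i] = t - p;
    }
}
```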

I made the changes above in the source code as you suggested and confirmed that the function works correctly.

However, when training with SSE the average loss stays at 0.8, and with cross entropy it starts at 2.0 and keeps rising. (For 8 classes, a uniform prediction gives -log(1/8) ≈ 2.08, so 2.0 is roughly the loss of random guessing.)

I increased the training data from 4,732 images to 39,000 and trained again, but the result is still bad. Do you have any ideas?