ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

box, obj, cls loss #5057

Closed karl-gardner closed 2 years ago

karl-gardner commented 3 years ago

Hello Glenn et al.,

For the box, obj, and cls losses reported in the training output and in the results.txt/.png files: are these the same as the YOLOv3 losses? If they are similar to YOLOv3, do they correspond to the coordinate loss, objectness loss, and classification loss:

[image]

given in the following post: https://towardsdatascience.com/yolo-v3-explained-ff5b850390f

Thanks,

Karl Gardner

glenn-jocher commented 3 years ago

@kgardner330 cls loss is unchanged from YOLOv3; the other two losses are updated slightly in YOLOv5. The obj target is updated from 1.0 to the CIoU value between target and anchor.

Box regression loss is also updated for stability and complexity improvements.

karl-gardner commented 3 years ago

@glenn-jocher so the cls loss is the same as in the original YOLO papers: [image]

while the obj_loss is updated slightly to the CIoU loss, which can be found in this review on object detection losses: https://arxiv.org/abs/1911.08287, right?

Thanks,

Karl Gardner

glenn-jocher commented 3 years ago

@kgardner330 yes cls loss is just BCE as in original. Objectness target is equal to CIoU.
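For readers following along, here is a minimal sketch of what "just BCE" means for the classification term. It is written in plain Python rather than PyTorch so it runs anywhere; the function names `bce_with_logits` and `cls_loss` are illustrative and not taken from the repository, and a plain one-hot target without label smoothing is assumed.

```python
import math

def bce_with_logits(logit: float, target: float) -> float:
    """Binary cross-entropy on a raw logit, in a numerically stable form.

    Equivalent to -[t*log(sigmoid(x)) + (1-t)*log(1-sigmoid(x))].
    """
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))

def cls_loss(logits, class_index):
    """Mean BCE over all classes against a one-hot target (assumption: no label smoothing)."""
    targets = [1.0 if k == class_index else 0.0 for k in range(len(logits))]
    return sum(bce_with_logits(x, t) for x, t in zip(logits, targets)) / len(logits)

# A confident, correct prediction gives a loss close to zero.
confident = cls_loss([10.0, -10.0, -10.0], class_index=0)
# An undecided logit of 0 costs log(2) per class.
undecided = bce_with_logits(0.0, 1.0)
```

Each class is scored independently with its own sigmoid, so the classes are not forced to compete as they would under a softmax.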

karl-gardner commented 3 years ago

Hello @glenn-jocher ,

So the objectness (obj) loss is equal to the CIoU (complete IoU) loss. What is the box (coordinate) loss equal to, then?

Karl Gardner

glenn-jocher commented 3 years ago

@kgardner330 I would start by reading the first 3 YOLO publications: https://pjreddie.com/publications/

Objectness target is equal to CIoU.
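As an illustration of "objectness target is equal to CIoU", the sketch below computes CIoU for two boxes in (center x, center y, width, height) form, following the definition in https://arxiv.org/abs/1911.08287, and then turns it into an objectness target. To my reading of the linked loss.py, the IoU there is computed between the predicted box and the target box, negative values are clamped to zero, and a blending ratio mixes the result with 1.0; the names `ciou`, `obj_target`, and `gr` below are illustrative, and the code is plain Python rather than the repository's tensor implementation.

```python
import math

def ciou(box1, box2, eps=1e-7):
    """Complete IoU between two (cx, cy, w, h) boxes."""
    x1, y1, w1, h1 = box1
    x2, y2, w2, h2 = box2
    # Corner coordinates of both boxes
    b1x1, b1x2, b1y1, b1y2 = x1 - w1 / 2, x1 + w1 / 2, y1 - h1 / 2, y1 + h1 / 2
    b2x1, b2x2, b2y1, b2y2 = x2 - w2 / 2, x2 + w2 / 2, y2 - h2 / 2, y2 + h2 / 2
    # Plain IoU
    inter_w = max(0.0, min(b1x2, b2x2) - max(b1x1, b2x1))
    inter_h = max(0.0, min(b1y2, b2y2) - max(b1y1, b2y1))
    inter = inter_w * inter_h
    union = w1 * h1 + w2 * h2 - inter + eps
    iou = inter / union
    # Squared diagonal of the smallest box enclosing both
    cw = max(b1x2, b2x2) - min(b1x1, b2x1)
    ch = max(b1y2, b2y2) - min(b1y1, b2y1)
    c2 = cw ** 2 + ch ** 2 + eps
    # Squared distance between the two centers
    rho2 = (x2 - x1) ** 2 + (y2 - y1) ** 2
    # Aspect-ratio consistency term and its weight
    v = (4 / math.pi ** 2) * (math.atan(w2 / h2) - math.atan(w1 / h1)) ** 2
    alpha = v / (v - iou + 1 + eps)
    return iou - (rho2 / c2 + v * alpha)

def obj_target(ciou_value, gr=1.0):
    """Objectness target: CIoU clamped at 0, blended with 1.0 by ratio gr (assumed name)."""
    return (1.0 - gr) + gr * max(ciou_value, 0.0)

perfect = ciou((0.0, 0.0, 2.0, 2.0), (0.0, 0.0, 2.0, 2.0))
disjoint = ciou((0.0, 0.0, 1.0, 1.0), (5.0, 5.0, 1.0, 1.0))
```

With `gr=1.0` the target is the clamped CIoU itself; with `gr=0.0` it falls back to the YOLOv3-style constant target of 1.0.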

karl-gardner commented 3 years ago

@glenn-jocher ,

If all you say is "the other two losses are updated slightly in YOLOv5" and that the obj loss (objectness loss) is equal to CIoU then how will others know what the slight update is for YOLOv5 without any paper on it? I have been reading the papers actually but that only tells me about the first three versions of YOLO not the new version.

I understand that the obj loss (objectness loss) is equal to CIoU but I was asking about the box_loss which you say is updated slightly.

Karl Gardner

glenn-jocher commented 3 years ago

@kgardner330 loss details are in loss.py: https://github.com/ultralytics/yolov5/blob/070af88108e5675358fd783aae9d91e927717322/utils/loss.py#L131-L137
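For anyone who does not want to open the link: to my reading, the referenced lines decode the raw network outputs into a box and take one minus the IoU with the target as the box loss. The sketch below mirrors that in plain Python for a single prediction. It is an approximation under stated assumptions: the function names are illustrative, and plain IoU is swapped in for the CIoU that the linked code uses, to keep the example short.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def decode_box(raw, anchor_wh):
    """Decode raw outputs (tx, ty, tw, th) into (x, y, w, h) in grid units."""
    tx, ty, tw, th = raw
    px = sigmoid(tx) * 2.0 - 0.5                  # center offset in (-0.5, 1.5); no anchor involved
    py = sigmoid(ty) * 2.0 - 0.5
    pw = (sigmoid(tw) * 2.0) ** 2 * anchor_wh[0]  # width in (0, 4 * anchor width)
    ph = (sigmoid(th) * 2.0) ** 2 * anchor_wh[1]
    return (px, py, pw, ph)

def iou_xywh(a, b, eps=1e-7):
    """Plain IoU of two (cx, cy, w, h) boxes (stand-in for CIoU in this sketch)."""
    ax1, ax2, ay1, ay2 = a[0] - a[2] / 2, a[0] + a[2] / 2, a[1] - a[3] / 2, a[1] + a[3] / 2
    bx1, bx2, by1, by2 = b[0] - b[2] / 2, b[0] + b[2] / 2, b[1] - b[3] / 2, b[1] + b[3] / 2
    inter = max(0.0, min(ax2, bx2) - max(ax1, bx1)) * max(0.0, min(ay2, by2) - max(ay1, by1))
    union = a[2] * a[3] + b[2] * b[3] - inter + eps
    return inter / union

def box_loss(pred_boxes, target_boxes):
    """Mean of (1 - IoU) over matched prediction/target pairs."""
    losses = [1.0 - iou_xywh(p, t) for p, t in zip(pred_boxes, target_boxes)]
    return sum(losses) / len(losses)

# All-zero raw outputs decode to the cell center with exactly the anchor's size.
decoded = decode_box((0.0, 0.0, 0.0, 0.0), anchor_wh=(3.0, 5.0))
loss = box_loss([decoded], [(0.5, 0.5, 3.0, 5.0)])
```

Note that the anchor only scales width and height; the center offset depends on the raw outputs alone.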

github-actions[bot] commented 2 years ago

👋 Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs.

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐!

Light-- commented 2 years ago

Hi @glenn-jocher, I'm a little confused by the code.

In the code you mentioned, why does pxy have nothing to do with the anchors? The calculation of pxy does not use any anchor information; only pwh does. May I ask why? Doesn't computing the box center require anchor information?

I look forward to your reply, and thank you!

glenn-jocher commented 2 years ago

@Light-- this is the loss definition. xy outputs do not involve anchors at all

Light-- commented 2 years ago

Hi @glenn-jocher,

Thanks a lot for your quick reply. I have read the YOLOv5 code. So, for build_targets in loss.py:

  1. xy is calculated from the grid center only, and wh is calculated from the anchor.
  2. The grid and the anchors are entirely different: the grid is fixed and unique, whereas the anchors are auto-generated and there are many of them.
  3. If I want to predict more point coordinates inside the box, do I only need to predict their offsets relative to the grid center (ignoring any anchor information)? Do you know of any good YOLOv5-based examples of this?

Am I right? Thank you.

glenn-jocher commented 2 years ago

@Light-- yes the xy losses are only based on the prediction with respect to the grid. You can add a value to both to get to the image origin but this does nothing to the loss so this step is skipped.
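The point about skipping the shift to the image origin can be checked numerically: an IoU-based loss does not change when the same offset is added to both boxes. The sketch below is plain Python with illustrative names and made-up box values; `gi` and `gj` stand for the grid cell's column and row index.

```python
def iou_xywh(a, b, eps=1e-7):
    """Plain IoU of two (cx, cy, w, h) boxes."""
    ax1, ax2, ay1, ay2 = a[0] - a[2] / 2, a[0] + a[2] / 2, a[1] - a[3] / 2, a[1] + a[3] / 2
    bx1, bx2, by1, by2 = b[0] - b[2] / 2, b[0] + b[2] / 2, b[1] - b[3] / 2, b[1] + b[3] / 2
    inter = max(0.0, min(ax2, bx2) - max(ax1, bx1)) * max(0.0, min(ay2, by2) - max(ay1, by1))
    union = a[2] * a[3] + b[2] * b[3] - inter + eps
    return inter / union

def shift(box, dx, dy):
    """Translate a (cx, cy, w, h) box by (dx, dy)."""
    return (box[0] + dx, box[1] + dy, box[2], box[3])

# Prediction and target expressed relative to their grid cell (hypothetical values).
pred = (0.6, 0.4, 2.0, 3.0)
target = (0.5, 0.5, 2.5, 2.5)
gi, gj = 7, 12  # hypothetical cell indices

local_iou = iou_xywh(pred, target)
absolute_iou = iou_xywh(shift(pred, gi, gj), shift(target, gi, gj))
```

`local_iou` and `absolute_iou` agree up to floating-point rounding, so adding the cell index to both boxes would only cost computation without affecting the loss or its gradient.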

Light-- commented 2 years ago

Hi @glenn-jocher, thanks for your reply! Your advice is truly important to me. Since my question may have been missed, may I ask it again?

  1. If I want to predict more point coordinates inside the box (such as the contact points between a car's wheels and the road, where the car is in the box), do I only need to predict their offsets relative to the top-left corner of the grid cell (ignoring any anchor information)? Do you know of any good YOLOv5-based examples of this?

And one more:

  1. If item 3 above is possible, do you think there will be a conflict in the model's feature learning? The model needs to learn to predict the offsets of both the contact points and the box center at the same time, but their locations differ: one is near the lower bound of the bbox, the other at the center. Both are predicted relative to the top-left corner of the same grid cell.

I look forward to your reply, and thanks very much!

glenn-jocher commented 2 years ago

@Light-- the model can learn anything you want, you just need to have labelled data and the correct model structure (mostly just updates to Detect() module) and loss function, which naturally will require customization on your part in those areas.
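To make the suggested customization a little more concrete, here is a deliberately simple and entirely hypothetical sketch of one extra point per box, predicted as an offset from the grid cell's top-left corner. Nothing here comes from the YOLOv5 code base: `decode_point`, `point_loss`, the unbounded offset, and the L1 penalty are all assumptions chosen for illustration, and a real change would also need extra output channels in the Detect() module and matching labels.

```python
def decode_point(raw_xy, gi, gj, stride):
    """Map a raw (dx, dy) offset in grid units to image pixels.

    gi, gj: column and row index of the grid cell; stride: pixels per cell.
    """
    return ((gi + raw_xy[0]) * stride, (gj + raw_xy[1]) * stride)

def point_loss(raw_xy, target_xy_px, gi, gj, stride):
    """L1 distance in grid units between the predicted and the labelled point."""
    tx = target_xy_px[0] / stride - gi  # labelled offset from the cell's top-left corner
    ty = target_xy_px[1] / stride - gj
    return abs(raw_xy[0] - tx) + abs(raw_xy[1] - ty)

# A point 0.5 cells right of and 1.25 cells below the corner of cell (3, 4) at stride 8.
point_px = decode_point((0.5, 1.25), gi=3, gj=4, stride=8)
loss = point_loss((0.5, 1.25), point_px, gi=3, gj=4, stride=8)
```

The offset is left unbounded here because a wheel contact point can lie several cells away from the cell that owns the box center; whether to bound it or scale it by the anchor size is a design choice that would need to be validated on labelled data.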