Why use BCE loss instead of CCE loss for multi-class detection?

captainst commented 2 years ago

❔Question

I see from the code in ComputeLoss that BCE (binary cross entropy) is used to calculate the cls and obj losses. For multi-class applications, it seems that the loss over multiple classes is broken down into multiple single-class BCE terms. This confuses me. Shouldn't we use CCE (categorical cross entropy) instead of BCE for multi-class applications?

Many thanks!

glenn-jocher commented 2 years ago

@captainst BCE loss provides better results and is probably safer for balancing the Classification loss against the Objectness loss, which is also BCE. It also allows YOLOv5 to be used in multi-label applications, e.g. a box labeled both 'flower' and 'rose'.
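
To illustrate the multi-label point, here is a minimal sketch in plain Python (not the YOLOv5 code itself). The class names and logit values are made up for the example. It shows that per-class sigmoids can score 'flower' and 'rose' highly at the same time, while a softmax forces the classes to compete for a total probability of 1.

```python
import math


def sigmoid(x):
    """Logistic function: maps a logit to an independent probability."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(logits):
    """Softmax: maps logits to probabilities that sum to 1 across classes."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def bce_with_logits(logits, targets):
    """Mean of the per-class binary cross entropy terms, computed from logits."""
    losses = [
        max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
        for z, t in zip(logits, targets)
    ]
    return sum(losses) / len(losses)


# Hypothetical classes and logits for one predicted box (illustration only).
class_names = ["flower", "rose", "car"]
logits = [4.0, 3.0, -5.0]

sig = [sigmoid(z) for z in logits]  # each class is scored on its own
soft = softmax(logits)              # classes compete with each other

# A multi-label target: the box is both a flower and a rose.
multi_label_target = [1.0, 1.0, 0.0]
loss = bce_with_logits(logits, multi_label_target)

for name, s, p in zip(class_names, sig, soft):
    print(f"{name:>6}: sigmoid={s:.3f}  softmax={p:.3f}")
print(f"BCE against the multi-label target: {loss:.4f}")
```

With sigmoids, both 'flower' and 'rose' can be close to 1, so a two-hot target is representable. With softmax, raising one class necessarily lowers the others, which only suits mutually exclusive labels.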

glenn-jocher commented 2 years ago

@captainst another point though is that this is an empirical science, so if you can produce better results with CE, let us know.

captainst commented 2 years ago

@glenn-jocher Thank you for your instant reply (that's fast!). If I understood correctly, BCE is also designed for multi-label apps, for example, "blue" and "shirt" for the same proposed box. Otherwise, in an application where each proposed box should be assigned only one label (mutually exclusive classes), for example bicycle or person, CCE can also be used. Correct?

glenn-jocher commented 2 years ago

@captainst yes that's right! CE can also be used with YOLOv5, though it would also require a rebalancing of the loss terms or an adjustment of the cls loss gain: https://github.com/ultralytics/yolov5/blob/ed887b5976d94dc61fa3f7e8e07170623dc7d6ee/data/hyps/hyp.scratch.yaml#L14
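
A rough sketch of why the gain would need revisiting, in plain Python rather than the actual training code. It assumes the BCE term is averaged over all classes (the default `mean` reduction of PyTorch's `BCEWithLogitsLoss`) and compares it with CE for the same untrained, all-zero logits. The class count of 80 and the target index are illustrative choices.

```python
import math


def mean_bce_with_logits(logits, target_index):
    """BCE averaged over classes, with a one-hot target at target_index."""
    total = 0.0
    for i, z in enumerate(logits):
        t = 1.0 if i == target_index else 0.0
        total += max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


def ce_with_logits(logits, target_index):
    """Categorical cross entropy: log-sum-exp minus the target logit."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target_index]


num_classes = 80               # illustrative class count
logits = [0.0] * num_classes   # an untrained head that outputs all zeros

bce = mean_bce_with_logits(logits, target_index=3)
ce = ce_with_logits(logits, target_index=3)
ratio = ce / bce

print(f"mean BCE = {bce:.4f}, CE = {ce:.4f}, CE / BCE = {ratio:.2f}")
```

Under these assumptions the mean BCE starts at ln 2 regardless of the number of classes, while CE starts at ln(num_classes), so the two losses sit on different scales. The ratio here is only a starting intuition; the right cls gain for CE would still have to be found empirically.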

captainst commented 2 years ago

@glenn-jocher many thanks again! I'll take a look at the yaml.

eecn commented 1 year ago

YOLOv5 uses BCE loss to achieve multi-class classification that implicitly includes a background class (when all class scores fall below the confidence threshold, such as 0.5, the box is background). Why, then, is obj_conf calculated separately to judge whether to include a sample or not? This confuses me.

glenn-jocher commented 1 year ago

@eecn the dual obj and cls losses and outputs are inherited from Joseph Redmon's original YOLOv3.
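
For context, a minimal sketch of how the two outputs are typically combined at inference time, assuming the final score of a box is the objectness probability multiplied by the class probability. The helper names, logits, and the 0.25 threshold are illustrative and not taken from this thread.

```python
import math


def sigmoid(x):
    """Logistic function: maps a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def box_scores(obj_logit, cls_logits):
    """Per-class confidence = objectness probability * class probability."""
    obj = sigmoid(obj_logit)
    return [obj * sigmoid(z) for z in cls_logits]


def keep_box(obj_logit, cls_logits, conf_thres=0.25):
    """Keep a box only if its best combined score clears the threshold."""
    return max(box_scores(obj_logit, cls_logits)) > conf_thres


cls_logits = [3.0, -2.0]  # the class head is confident about class 0

# Same class logits, different objectness (hypothetical values).
background = box_scores(-4.0, cls_logits)   # low objectness
real_object = box_scores(3.0, cls_logits)   # high objectness

print("background :", [round(s, 3) for s in background], keep_box(-4.0, cls_logits))
print("real object:", [round(s, 3) for s in real_object], keep_box(3.0, cls_logits))
```

In this sketch the objectness output suppresses a box even when the class head is confident, which is the role a separate obj term plays alongside the per-class scores.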

sujit-ag commented 1 year ago

I've used CE loss instead of BCE-with-logits loss and saw a huge performance improvement for my use case. A new set of experiments is in progress, training with a higher gain value for the CE loss and with CE combined with focal loss.
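
The comment does not say how CE was combined with focal loss, so the following is only one common formulation, sketched in plain Python: the CE term is scaled by (1 - p_t)^gamma, where p_t is the softmax probability of the true class. The function names, logits, and gamma value are assumptions for illustration.

```python
import math


def ce_with_logits(logits, target_index):
    """Categorical cross entropy: log-sum-exp minus the target logit."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target_index]


def focal_ce_with_logits(logits, target_index, gamma=2.0):
    """CE scaled by (1 - p_t)^gamma, which down-weights easy examples."""
    ce = ce_with_logits(logits, target_index)
    p_t = math.exp(-ce)  # softmax probability assigned to the true class
    return (1.0 - p_t) ** gamma * ce


# Hypothetical logits: one well-classified box and one badly classified box.
easy_logits = [5.0, 0.0, 0.0]  # true class 0 has the highest logit
hard_logits = [0.0, 0.0, 5.0]  # true class 0 is outscored by class 2

easy_ce = ce_with_logits(easy_logits, 0)
hard_ce = ce_with_logits(hard_logits, 0)
easy_focal = focal_ce_with_logits(easy_logits, 0)
hard_focal = focal_ce_with_logits(hard_logits, 0)

print(f"easy: CE={easy_ce:.5f}  focal={easy_focal:.7f}")
print(f"hard: CE={hard_ce:.5f}  focal={hard_focal:.5f}")
```

With gamma set to 0 the focal term reduces to plain CE. Larger gamma values shrink the contribution of boxes that are already classified correctly, so the gain may need retuning again.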

glenn-jocher commented 11 months ago

@sujit-ag that's fantastic to hear! It's great to see you experimenting with different loss functions and observing performance improvements. Your feedback is valuable for the YOLO community. Keep up the great work!

ultralytics / yolov5