ultralytics / yolov5

YOLOv5 πŸš€ in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

Confidence level decreases when two types of objects are close to each other #11325

Closed LHyang9527 closed 1 year ago

LHyang9527 commented 1 year ago

Search before asking

Question

I trained a two-class YOLOv5 detector: faces with masks and faces without masks, with a 1:1 dataset ratio. In the final detection, the confidence of the unmasked face drops sharply whenever a masked face is close to it.

Additional

No response

github-actions[bot] commented 1 year ago

πŸ‘‹ Hello @LHyang9527, thank you for your interest in YOLOv5 πŸš€! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a πŸ› Bug Report, please provide a minimum reproducible example to help us debug it.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.

Requirements

Python>=3.7.0 with all requirements.txt installed including PyTorch>=1.7. To get started:

git clone https://github.com/ultralytics/yolov5  # clone
cd yolov5
pip install -r requirements.txt  # install

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/cuDNN, Python and PyTorch preinstalled):

Status

YOLOv5 CI

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training, validation, inference, export and benchmarks on macOS, Windows, and Ubuntu every 24 hours and on every commit.

Introducing YOLOv8 πŸš€

We're excited to announce the launch of our latest state-of-the-art (SOTA) object detection model for 2023 - YOLOv8 πŸš€!

Designed to be fast, accurate, and easy to use, YOLOv8 is an ideal choice for a wide range of object detection, image segmentation and image classification tasks. With YOLOv8, you'll be able to quickly and accurately detect objects in real-time, streamline your workflows, and achieve new levels of accuracy in your projects.

Check out our YOLOv8 Docs for details and get started with:

pip install ultralytics
glenn-jocher commented 1 year ago

Hi @LHyang9527, with what you are describing, it sounds like the detection of the masked face is having a negative impact on the detection of the unmasked face. Have you tried separate models for each task? You could train one model to detect masked faces and another model to detect unmasked faces. Alternatively, you could try adjusting the object detection thresholds in the final detection or use object tracking algorithms to help the model distinguish between the two objects.
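The per-class threshold idea above can be sketched as a simple post-processing step. This is a minimal illustration, not part of the YOLOv5 API: the function name, the class indices (0 = masked, 1 = unmasked), and the threshold values are all assumptions for this example.

```python
# Sketch: apply a different minimum confidence per class after inference.
# Detections use the (x1, y1, x2, y2, conf, cls) layout that YOLOv5's
# NMS output also follows; everything else here is illustrative.

def filter_by_class_threshold(detections, thresholds, default=0.25):
    """Keep only detections whose confidence meets their class threshold.

    detections: list of (x1, y1, x2, y2, conf, cls) tuples.
    thresholds: dict mapping class index -> minimum confidence.
    """
    return [d for d in detections if d[4] >= thresholds.get(int(d[5]), default)]

dets = [
    (10, 10, 50, 50, 0.90, 0),   # masked face, high confidence
    (60, 12, 100, 55, 0.30, 1),  # unmasked face, suppressed confidence
]
# Lower the bar for class 1 so low-confidence unmasked faces survive.
kept = filter_by_class_threshold(dets, {0: 0.5, 1: 0.25})
print(len(kept))  # prints 2
```

With a single global threshold of 0.5 the unmasked face would be dropped; the per-class setting keeps it while still filtering noisy masked-face candidates.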

Let me know if you have any other questions!

LHyang9527 commented 1 year ago

We tested masked and unmasked face datasets at 1:1 and 2:1 ratios, as a single category, and as two categories (masked eyes + unmasked eyes), and the phenomenon basically did not improve. The .pt model file exceeds the 25 MB limit, so it could not be uploaded. The four photos in the compressed file, numbered 1 to 4, show the two people moving from far to near; as the figures show, the confidence of the unmasked person is affected when the masked person's eyes come close.

Hi @glenn-jocher, thanks for the reply. We want a single model compatible with both masked and maskless scenes: if we separate the two models, we would need to know in advance which category the people in the picture belong to, and both categories may be present at the same time.

Python 3.7.6, torch 1.7.0, CUDA 10.2

[Attached images 1 to 4: the two people at decreasing distance, showing the confidence drop on the unmasked face]

LHyang9527 commented 1 year ago

Any help?

glenn-jocher commented 1 year ago

@LHyang9527, thanks for providing the additional details and images. It looks like the model is having a hard time distinguishing between the masked and unmasked faces when they are close together.

One potential solution to this problem is to use object tracking, which can help the model recognize and track individual objects over time. This approach works by assigning an ID to each object detected in the first frame and then tracking that object throughout subsequent frames. This can help the model differentiate between masked and unmasked faces, even when they are close together.
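The tracking approach described above can be sketched with a greedy IoU matcher: assign an ID to every box in the first frame, then match each new box to the existing track with the greatest overlap. All names here are illustrative, not part of the YOLOv5 API, and a production tracker (e.g. one with motion prediction) would be more robust.

```python
# Minimal sketch of IoU-based track assignment across frames.

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def update_tracks(tracks, boxes, next_id, iou_thresh=0.3):
    """Greedily match new boxes to existing tracks by best IoU;
    unmatched boxes start new tracks. tracks: dict id -> box."""
    new_tracks = {}
    for box in boxes:
        best_id, best_iou = None, iou_thresh
        for tid, tbox in tracks.items():
            score = iou(box, tbox)
            if score > best_iou and tid not in new_tracks:
                best_id, best_iou = tid, score
        if best_id is None:
            best_id, next_id = next_id, next_id + 1
        new_tracks[best_id] = box
    return new_tracks, next_id

# Frame 1: two faces; Frame 2: the same faces shifted slightly.
tracks, nid = update_tracks({}, [(0, 0, 40, 40), (100, 0, 140, 40)], 0)
tracks, nid = update_tracks(tracks, [(5, 0, 45, 40), (98, 2, 138, 42)], nid)
print(sorted(tracks))  # prints [0, 1] -- both IDs persist across frames
```

Once each face keeps a stable ID, a masked/unmasked label from earlier frames can be carried forward through the frames where the detector's confidence dips.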

Another potential solution is to adjust the object detection thresholds for the masked and unmasked classes separately, so that the model is more sensitive to one class or the other depending on the situation. You can experiment with different thresholds to find the settings that work best for your use case.

If these solutions don't work or if you have further questions, feel free to reach out!

LHyang9527 commented 1 year ago

@glenn-jocher Thank you for your reply and the ideas provided; we will try them subsequently. We have recently run some related comparative experiments with the following results. Can you help us analyze them?

[Attached screenshot: comparative experiment results, YOLOv5 6.0, M-size network]

LHyang9527 commented 1 year ago

To revise: the seventh set of experimental data is coco128 + our eye data (3966).

glenn-jocher commented 1 year ago

@LHyang9527, thank you for sharing your results. From the graphs you provided, it looks like the model is performing well on the fully masked and fully unmasked faces, but is struggling with the cases where the masked and unmasked faces are close together.

This behavior is expected given the nature of the problem, and it may be difficult to achieve high accuracy in these cases. However, some techniques that may help include adjusting the detection thresholds, using object tracking, or trying different model architectures or training techniques.
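One mechanism worth checking in cases like this, beyond the thresholds and tracking mentioned above, is cross-class suppression during NMS. Class-aware NMS implementations (including YOLOv5's non_max_suppression when run without the agnostic option) avoid it by offsetting each box by its class index before suppression, so boxes of different classes can never overlap. The sketch below is a pure-Python illustration of that offset trick, not the library code; the offset value and function names are assumptions.

```python
# Sketch of the class-offset trick used by class-aware NMS: shift each
# box by cls * offset so different classes never suppress each other.

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / (union + 1e-9)

def nms(dets, iou_thresh=0.45):
    """Plain greedy NMS over (x1, y1, x2, y2, conf, cls) tuples."""
    kept = []
    for d in sorted(dets, key=lambda d: -d[4]):
        if all(iou(d[:4], k[:4]) <= iou_thresh for k in kept):
            kept.append(d)
    return kept

def class_aware_nms(dets, iou_thresh=0.45, offset=4096):
    """Shift boxes by cls * offset, run plain NMS, shift back."""
    shifted = [(x1 + c * offset, y1 + c * offset,
                x2 + c * offset, y2 + c * offset, conf, c)
               for (x1, y1, x2, y2, conf, c) in dets]
    kept = nms(shifted, iou_thresh)
    return [(x1 - c * offset, y1 - c * offset,
             x2 - c * offset, y2 - c * offset, conf, c)
            for (x1, y1, x2, y2, conf, c) in kept]

# Two heavily overlapping boxes of different classes:
dets = [(0, 0, 40, 40, 0.9, 0), (2, 2, 42, 42, 0.8, 1)]
print(len(nms(dets)))              # prints 1: class-agnostic NMS drops one
print(len(class_aware_nms(dets)))  # prints 2: both classes survive
```

If the pipeline here happens to use class-agnostic NMS, a nearby masked face could be suppressing the unmasked detection outright; with class-aware NMS, the remaining confidence drop would point to the model itself rather than post-processing.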

In any case, it's important to carefully evaluate the model's performance in various scenarios and continue to fine-tune it with additional data or techniques as needed. Good luck with your research!