SkalskiP opened 5 months ago
Thanks for the fantastic demo and detailed evaluation! We previously posted a comment on your demo page to provide some clarification. https://huggingface.co/spaces/SkalskiP/YOLO-ARENA/discussions/1. Thanks!
Hi @jameslahm 👋🏻 If I understand your comment correctly, the differences are due to:
- Loss of accuracy resulting from conversion to ONNX (is this expected)?
@SkalskiP Thanks. We tried to reproduce the loss of accuracy in our local environment. We ran inference on the image using the same ONNX conversion, following the process below, but the result still differs from that of the demo. So we are not sure where the cause lies and would like to ask for your help.
```bash
wget https://skalskip-yolo-arena.hf.space/file=/tmp/gradio/f878616a10625ce7dba02bcb34df2df279273666/image.png
yolo export model=yolov10m.pt format=onnx opset=13 simplify half=True device=0
yolo predict model=yolov10m.onnx source=vehicles.png half conf=0.4
```
- Different optimal confidence thresholds?
Yes, we think so.
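For anyone reproducing this, the CLI steps above map onto the Python API as well; a minimal sketch, assuming the ultralytics-style `export`/`predict` arguments carry over unchanged and the files named in the commands are present:

```python
from ultralytics import YOLOv10

# FP16 ONNX export on GPU 0, mirroring the CLI flags above.
model = YOLOv10('yolov10m.pt')
model.export(format='onnx', opset=13, simplify=True, half=True, device=0)

# Run the exported model at the same confidence threshold.
onnx_model = YOLOv10('yolov10m.onnx', task='detect')
onnx_model.predict(source='vehicles.png', half=True, conf=0.4)
```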
Same issue with my model. Can we freely set a smaller threshold for YOLOv10 to detect more small objects?
@pseacrest Yes, we think so.
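A minimal sketch of lowering the threshold with the repo's `YOLOv10` API (the weights path and threshold value here are illustrative):

```python
from ultralytics import YOLOv10

model = YOLOv10('yolov10m.pt')

# A lower `conf` keeps lower-scoring candidates, which tends to recover
# more small objects at the cost of extra false positives.
results = model.predict(source='vehicles.png', conf=0.1)
print(len(results[0].boxes), 'boxes kept at conf=0.1')
```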
The ONNX models that we are running were converted by our ML team. I'll try to understand how they did it and get back to you.
@SkalskiP The ONNX models were converted using the instructions in the README.md.
The steps below were followed for n/s/m/b/l/x pt files:
```bash
yolo export model=yolov10n.pt format=onnx opset=13 simplify
```
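A sketch of that same export loop via the Python API, assuming all six `.pt` files sit in the working directory:

```python
from ultralytics import YOLOv10

# Export every variant with the same settings as the CLI command above:
# ONNX format, opset 13, simplified graph.
for variant in ['n', 's', 'm', 'b', 'l', 'x']:
    model = YOLOv10(f'yolov10{variant}.pt')
    model.export(format='onnx', opset=13, simplify=True)
```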
@NickHerrig Thanks! Would you mind checking whether the results of the following in your local environment match those of the demo? Thank you!
> - Loss of accuracy resulting from conversion to ONNX (is this expected)?
>
> @SkalskiP Thanks. We tried to reproduce the loss of accuracy in our local environment. We ran inference on the image using the same ONNX conversion, following the process below, but the result still differs from that of the demo. So we are not sure where the cause lies and would like to ask for your help.
>
> ```bash
> wget https://skalskip-yolo-arena.hf.space/file=/tmp/gradio/f878616a10625ce7dba02bcb34df2df279273666/image.png
> yolo export model=yolov10m.pt format=onnx opset=13 simplify half=True device=0
> yolo predict model=yolov10m.onnx source=vehicles.png half conf=0.4
> ```
Hi @jameslahm 👋🏻
I just updated https://huggingface.co/spaces/SkalskiP/YOLO-ARENA. Now we load images with Pillow, and the results are slightly different.
I also added per-model confidence threshold sliders.
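One plausible source of such small differences is channel order: OpenCV decodes images to BGR while Pillow yields RGB, and a swapped channel order shifts confidence scores without breaking detection outright. A minimal check, assuming `vehicles.png` is an ordinary PNG:

```python
import cv2
import numpy as np
from PIL import Image

# OpenCV returns BGR (alpha dropped by default); Pillow returns RGB.
bgr = cv2.imread('vehicles.png')
rgb = np.asarray(Image.open('vehicles.png').convert('RGB'))

# The arrays match only after reversing the channel axis.
assert np.array_equal(bgr[:, :, ::-1], rgb)
```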
@SkalskiP Thank you very much! The results of the demo still seem to differ from our local environment. We are investigating this and will get back to you once we identify the root cause.
@SkalskiP Hello, you have designed a simple and elegant interface; is it open source? By the way, I just checked your GitHub page and it's very impressive. I loved the neural networks NumPy example.
@SkalskiP @NickHerrig We found that the inference results are not the same between our codebase and Roboflow Inference with the same ONNX file. Here is a minimal example for reproducing the issue.
```bash
wget https://skalskip-yolo-arena.hf.space/file=/tmp/gradio/56eee51b0a661453cbf915229dfbadc00b7a0cad/vehicles.png
pip install -q git+https://github.com/THU-MIG/yolov10.git
```
```python
import numpy as np
import supervision as sv
from inference import get_model
from PIL import Image


def detect_and_annotate(
    input_image: np.ndarray,
    confidence_threshold: float,
    iou_threshold: float = 0,
):
    model = get_model(model_id="coco/22")
    result = model.infer(
        input_image,
        confidence=confidence_threshold,
        iou_threshold=iou_threshold
    )[0]
    detections = sv.Detections.from_inference(result)
    print(detections.data['class_name'])


detect_and_annotate(Image.open('vehicles.png'), 0.4)
```
```python
from PIL import Image
from ultralytics import YOLOv10

model = YOLOv10('/tmp/cache/coco/22/weights.onnx', task='detect')
model.predict(source=Image.open('vehicles.png'), verbose=True, conf=0.4)
```
The output is:
```
# Roboflow inference
['truck' 'car' 'car']

# This codebase
Loading /tmp/cache/coco/22/weights.onnx for ONNX Runtime inference...
0: 640x640 3 cars, 1 truck, 16.7ms
Speed: 11.3ms preprocess, 16.7ms inference, 15.7ms postprocess per image at shape (1, 3, 640, 640)
```
We observe that one truck and two cars are detected with Roboflow Inference, while one truck and three cars are detected in our codebase. May we ask for your help? Thanks a lot!
@salwaghanim, thanks a lot! The UI is built with Gradio.
@jameslahm I'll let @NickHerrig try to investigate that.
@SkalskiP and @jameslahm It appears that the different prediction confidence scores are the result of different preprocessing steps (resizing) in `inference` and the `yolo` CLI. I was able to run a test on the `yolo` CLI and roboflow/inference with the images already resized to 640px, and I am seeing the same predictions and confidence scores.
Take a look at the image below, where on the right we see `inference` results and on the left we see `yolo` CLI results:
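A sketch of the pre-resizing workaround described above, so both pipelines receive an identical 640×640 input (a plain stretch-resize is assumed here; letterboxing would preserve the aspect ratio instead):

```python
from PIL import Image

# Resize once, up front, so neither pipeline applies its own
# (differing) resize step during preprocessing.
image = Image.open('vehicles.png').convert('RGB').resize((640, 640))
image.save('vehicles_640.png')
```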
@NickHerrig Thanks a lot for your great efforts! Is the different preprocessing step between `inference` and the `yolo` CLI expected?
@SkalskiP It seems that we and @NickHerrig have identified the root cause. One reason is that Roboflow Inference invokes NMS in the postprocessing of YOLOv10, which is not needed since YOLOv10 does not rely on NMS. Besides, the exported ONNX files may be corrupted; replacing them with our exported ONNX models leads to the same results as our local environment. We have submitted a PR https://github.com/roboflow/inference/pull/437 to fix these issues. Thank you!
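For context, YOLOv10's one-to-one head already emits a deduplicated set of boxes, so the exported ONNX output can be filtered by score alone. A minimal sketch of NMS-free postprocessing, assuming the common YOLOv10 export layout of `(N, 6)` rows holding `x1, y1, x2, y2, score, class_id`:

```python
import numpy as np

def postprocess_yolov10(raw: np.ndarray, conf: float = 0.4) -> np.ndarray:
    """Filter YOLOv10 ONNX output by confidence only.

    No NMS pass: the one-to-one matching in YOLOv10's head already
    suppresses duplicate predictions, so running NMS on top can
    drop or reorder valid boxes.
    """
    return raw[raw[:, 4] >= conf]
```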
@SkalskiP The PR https://github.com/roboflow/inference/pull/437 has been merged. The results of Roboflow Inference and our local environment are the same now. Would you mind updating the `inference` version in the `requirements.txt` of the HF Space? Thanks a lot!
@SkalskiP Friendly ping :) Thanks!
@SkalskiP We opened a PR in https://huggingface.co/spaces/SkalskiP/YOLO-ARENA/discussions/2 to update the `inference` version of the HF Space. Would you mind taking a look? Thanks a lot!
Hi 👋🏻
I noticed that YOLOv10 has trouble detecting small objects, especially compared to YOLOv8 and YOLOv9. I have built a small HF Space where you can test this. Is this a known issue? Is there anything I can do to improve its performance relative to the other models?
Here is the comparison of YOLOv8l at 640x640 and YOLOv10l at 640x640:
https://github.com/THU-MIG/yolov10/assets/26109316/94ad1c43-80dd-402e-a8cf-de51aea63560