WongKinYiu / YOLO

An MIT rewrite of YOLOv9
MIT License
292 stars 19 forks source link

Dets should have the same type as scores #22

Closed Alan01252 closed 1 month ago

Alan01252 commented 1 month ago

Hi,

Firstly thanks for the excellent work on this!

I've been trying the examples this morning ( and I realise this repository is all very new ) but am unable to get the class's/scores out of the image.

Is this expected with the current state of the project? My assumption is not because otherwise you'd also be seeing this error.

If I "fix" the error

valid_box = valid_box.float()
valid_cls = valid_cls.float() 

Then I end up with an image like the below ( which I assume isn't what's expected):

image

python examples/lazy.py device=cpu task=inference task.source=./demo/images/inference/image.png

I am wondering if it's something to do with device=cpu? As I think it's directly from the output of the model.

I am new'ish to machine learning computer vision and have tried to dig into what's going wrong, but am lacking skills, so any point in the right direction as how I could diagnose further/help fix would be greatly appreciated too.

henrytsui000 commented 1 month ago

Hi Alan01252,

Thank you for pointing out the issues!

The Error

The exception is indeed caused by the CPU. I suspect it's related to the Torch JIT or AMP. When running on a GPU, Torch might automatically ensure datatype consistency, but this doesn't seem to be the case in CPU mode. To address this, I will investigate further and include a fix in the upcoming push. In the meantime, you can try modifying bounding_box_utils.py from: https://github.com/WongKinYiu/yolov9mit/blob/7dd7b628a945e37ec494b3278f432c194da1aa96/yolo/utils/bounding_box_utils.py#L300 to:

valid_cls = cls_idx[valid_mask].float()

as you mentioned. This should resolve the issue directly.

The Prediction Result

Regarding the prediction results, I believe they are correct after your code modification. The numerous strange bounding boxes on the image are likely due to the NMS confidence, which is the prediction confidence rate. To improve this, you can try adding task.nms.min_confidence=0.5 to your command line. This adjustment should make the predictions more robust.

$python yolo/lazy.py device=cpu task=inference task.nms.min_confidence=0.75

Please let me know if you encounter any further issues or have additional questions.

Best regards,
Hao-Tang Tsui

Alan01252 commented 1 month ago

Thanks @henrytsui000,

The boundary boxes actually look good, it's the numbers that are confusing me, they are all 0, and I suspect that's why the .float is required?

henrytsui000 commented 1 month ago

Hi,

In the MS COCO dataset, the class index of people is 0, so we follow their settings here. I understand this may cause misunderstanding. So if I have time, I will change the annotation to the object name and confidence appearance.

Hao-Tang, Tsui

Alan01252 commented 1 month ago

Oh that makes perfect sense! Thank you for clarifying, I feel stupid :)

Alan01252 commented 1 month ago

I knocked up a little feature here

https://github.com/Alan01252/yolov9mit/tree/feature/improve-draw

image

henrytsui000 commented 1 month ago

Thank you for sharing your code! I have briefly reviewed it, and I must say that most of the code is clean and well-structured. Great job!

I have a suggestion for a minor revision: specifically, it would be beneficial to include the class names in coco.yaml and modify inference.yaml to load COCO's classes by default.

# yolo/config/task/dataset/coco.yaml
path: ...
class_name: [the class list]
autodownload: ...

During the inference stage, we could check if the inference.yaml configuration has a class list and draw the image with the class name accordingly. If the class list is not provided, we can default to showing the class index.

Would you be able to open a pull request for this change? Alternatively, would it be alright if I borrow some of your code to make these adjustments?

Looking forward to your response!

Alan01252 commented 1 month ago

hi @henrytsui000

I am happy for you to use the code however you want.

I did attempt to make the changes suggested but got confused by the requirements sorry.

Moving the class names into # yolo/config/task/dataset/coco.yaml like this

path: data/coco
train: train2017
class_names:
  person
  bicycle
  car

makes sense to me, but I got confused by the requirement to:

modify inference.yaml to load COCO's classes by default.

I'm not sure what structure inference.yaml would take

henrytsui000 commented 3 weeks ago

Hi,

Sorry for the late reply; I've been quite busy these days. I wanted to let you know that I have implemented your suggestion in commit a0976c99ab400740da9269e4eca1116118f246f0. You can try visualizing it if you'd like!

I deeply appreciate your suggestion and had to refer to your code to make this happen. Thank you for your valuable input!

I also apologize for the misleading information in my last reply. After some trials, I believe that placing the class names in general.yaml may be the best approach.

Best regards, Hao-Tang, Tsui

Alan01252 commented 3 weeks ago

Thank you for the kind words! I have checked out what you've done and it all looks great .

Also thanks for all your hardwork on this, I am learning a TON following the progress of this repository!