zhangfaen / finetune-Qwen2-VL

MIT License
133 stars 17 forks

Qwen2-VL support detection #12

Open WangRongsheng opened 1 day ago

WangRongsheng commented 1 day ago

This will help you. In fact, Qwen2-VL supports object detection!

zhangfaen commented 1 day ago

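Thanks, Rongsheng, for the links. I took a quick look at the discussions there. Qwen-VL and Qwen2-VL do seem to have considered supporting object detection, but Qwen2-VL's object detection performance appears to be quite poor... That may also be why they did not make a point of mentioning object detection support when Qwen2-VL was released. It looks like they say they will provide a solution later.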

Thanks for the links above; they were very helpful.

WangRongsheng commented 14 hours ago

Maybe this PR has already solved the issue.

https://github.com/huggingface/transformers/pull/33487/files

zhangfaen commented 11 hours ago

Thank you, Rongsheng. https://github.com/huggingface/transformers/pull/33487/files looks like a reasonable fix. I hope it will be merged into the main branch of the transformers library.

In this repo, I may cherry-pick that PR first and give it another try.

WangRongsheng commented 10 hours ago

Looking forward to it.