Open isr431 opened 2 weeks ago
+1 This would be another great addition!
This model is awesome
I am looking forward to it very much
+1 I am looking forward to it very much
We can try Llamafing it
+1
+1
+1
+1
+1
+1
+1
+1
Any updates?
+1
+1
+1
+1
+1
+1
I cannot wait for it!
Maybe people should also express interest and ask the Qwen2-VL devs to implement it: https://github.com/QwenLM/Qwen2-VL/issues/7
Looking forward to using llama.cpp for on-device inference.
Is anyone already working on this? If not, I would like to give it a try.
+1, are there any updates?
+1
+1
+1
Prerequisites
Feature Description
Qwen just released Qwen2-VL 2B & 7B under the Apache 2.0 License.
Motivation
SoTA understanding of images of various resolution & ratio: Qwen2-VL achieves state-of-the-art performance on visual understanding benchmarks, including MathVista, DocVQA, RealWorldQA, MTVQA, etc.
Understanding videos of 20min+: Qwen2-VL can understand videos over 20 minutes for high-quality video-based question answering, dialog, content creation, etc.
Possible Implementation
No response