Open Iven2132 opened 5 days ago
@Iven2132 do you mean integration into HF transformers itself?
Supporting both GPU & TPU may be a bit challenging, but it is something we may investigate later.
Our current code depends on HF transformers and should be usable with it.
We specifically target vision-centric capabilities, but our model is general-purpose. See more info on our site or in the paper: https://cambrian-mllm.github.io/
@ellisbrown Yes, I mean in HF transformers itself. What are the target vision-centric capabilities? Can it write code from a given UI, etc.?
We didn't target generating code from a UI specifically. You can certainly try, but no guarantees there.
As for vision-centric capabilities: have a look at the benchmarks we classify as "vision-centric" for a better idea—MMVP, RealWorldQA, and the CV-Bench we introduced.
You can read more about our CV-Bench benchmark in section 3.2 of the paper; it tests 4 different vision-centric capabilities.
It would be cool if this model could have HF transformers support. Also, what is the specialty of this model? What is it particularly good at?