-
Hi @yxgeee @RobertLuo1 @ShiFengyuan1999 ,
Suppose we use MAGVIT2 for image understanding, i.e., in an architecture like LLaVA. It will be easier for the LLM to understand the encoded embedding if it…
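For context, a minimal sketch of the LLaVA-style connector I have in mind: projecting the visual tokenizer's embeddings into the LLM's input space. The dimensions and module names here are illustrative assumptions, not MAGVIT2's or LLaVA's actual values:

```python
import torch
import torch.nn as nn

class VisualProjector(nn.Module):
    """Maps encoder/tokenizer embeddings (e.g., MAGVIT2 codes) into the LLM's embedding space."""
    def __init__(self, vis_dim: int = 256, llm_dim: int = 4096):  # assumed dimensions
        super().__init__()
        self.proj = nn.Linear(vis_dim, llm_dim)

    def forward(self, vis_embeds: torch.Tensor) -> torch.Tensor:
        # vis_embeds: [batch, num_visual_tokens, vis_dim]
        return self.proj(vis_embeds)

# Usage: prepend projected visual tokens to the text token embeddings
projector = VisualProjector()
vis = torch.randn(1, 196, 256)   # placeholder visual token embeddings
txt = torch.randn(1, 32, 4096)   # placeholder text token embeddings
llm_inputs = torch.cat([projector(vis), txt], dim=1)
print(llm_inputs.shape)  # torch.Size([1, 228, 4096])
```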
-
I am trying to add TinyLlama chat to Android (the Llama2 demo works), but it failed. My steps are listed below:
* First, I added `modeling_tinyllama.hpp` under tools/jni and added some lines of code to…
-
Although there are libraries that wrap vLLM, such as TGI, I want to know how to use vLLM itself with streaming output enabled; it is currently hard to find an out-of-the-box example for this.
Typically, with the original HF t…
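For reference, a minimal sketch of streaming with vLLM's `AsyncLLMEngine`. The model name and prompt are placeholders, and the exact `generate` signature may differ across vLLM versions:

```python
import asyncio
from vllm import AsyncEngineArgs, AsyncLLMEngine, SamplingParams

async def main():
    engine = AsyncLLMEngine.from_engine_args(
        AsyncEngineArgs(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")  # placeholder model
    )
    params = SamplingParams(max_tokens=128)
    printed = 0
    # engine.generate yields cumulative RequestOutput objects as tokens arrive
    async for output in engine.generate("Hello, my name is", params, request_id="req-0"):
        text = output.outputs[0].text
        print(text[printed:], end="", flush=True)  # print only the new delta
        printed = len(text)

asyncio.run(main())
```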
-
I find this really unexpected: how can a Phi-3 model surpass Mistral-7B in the case of VideoChat2, which uses a giant vision encoder?
Which part actually makes it work?
-
Hi, I find your paper interesting. However, as a newbie in this field, I am wondering about the T (number of frames) used here. Does each dataset sample the same T, and is it always small? And how much GPU …
-
Don't use a fixed image path in the demo; add a new parameter instead, e.g.:
`cmdParser.add<std::string>("image", 'i', "specify mllm image path", false, "../assets/cat.jpg");`
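Assuming the demo uses the same cmdline parser as mllm's other examples, the value can then be read back after `parse_check` with `cmdParser.get<std::string>("image")`.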
-
How can I perform batch inference?
-
Thanks for your great work!
I understand that for open-source models, you compute the likelihood of the MLLM generating each choice's content given the question.
For closed-source models like GPT…
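To make sure I understand the open-source scoring correctly, here is a minimal sketch of likelihood-based multiple-choice evaluation with a generic HF causal LM. The model, prompt format, and choices are placeholder assumptions, not the paper's actual setup (which would also condition on image features):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")  # placeholder text-only model
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def choice_logprob(question: str, choice: str) -> float:
    """Sum of log-probs the model assigns to the choice tokens given the question."""
    prompt_ids = tok(question, return_tensors="pt").input_ids
    choice_ids = tok(choice, return_tensors="pt").input_ids
    input_ids = torch.cat([prompt_ids, choice_ids], dim=1)
    with torch.no_grad():
        logits = model(input_ids).logits
    # logits at position t predict token t+1, so score only the choice tokens
    logprobs = logits.log_softmax(-1)[0, prompt_ids.size(1) - 1 : -1]
    return logprobs.gather(1, choice_ids[0].unsqueeze(1)).sum().item()

question = "Q: What color is the sky? A:"
choices = [" blue", " green", " red"]
print(max(choices, key=lambda c: choice_logprob(question, c)))
```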
-
Hi, thanks for your great work. I would like to know: will you release the pre-training code?