-
## ❓ General Questions
Hi all,
I've recently been trying to port Microsoft's Florence-2-large model to mlc. It seems to run at first, but I have a problem. Multimodal LLM models usually have …
-
Great work!
After reading your paper _SPAE: Semantic Pyramid AutoEncoder for Multimodal Generation with Frozen LLMs_, I'm very interested in the implementation, especially how the image is reconstr…
-
The architecture currently followed is still too closely bound to traditional NLU-based voice interaction concepts. While it aimed at including LLM with speech, LLM with multimodality, ... it is po…
-
Multimodal support has been removed since https://github.com/ggerganov/llama.cpp/pull/5882
Depending on the refactoring of `llava`, we will be able to bring back support: https://github.com/ggerganov/lla…
-
### Question Validation
- [X] I have searched both the documentation and discord for an answer.
### Question
Hello!
Let's say I use a multimodal model like `gpt-4o` and a text model like `gemin…
-
Following up on CogVLM, CogVLM2 is here: https://github.com/THUDM/CogVLM2
Easily one of the best open-source multimodal models, competitive with GPT-4 and Gemini.
https://github.com/THUDM/Co…
-
Hello,
I downloaded the model (NCSOFT/Llama-3-OffsetBias-RM-8B) from Hugging Face
and then ran the command below:
```
pip install -r requirements.txt
```
and then
```
from module import VllmModule
…
```
-
### System Info
- NVIDIA A100 80G * 2
- Libraries
- TensorRT-LLM: 0.11.0.dev2024052800
- Driver Version: 525.105.17
- CUDA Version: 12.4
### Who can help?
@byshiue @schetlur-nv
##…
-
The report contains a large number of tables and figures that contain much information not mentioned in the text. In this stage, we focus on converting the text to RDF, but these tables and figures al…
-
### System Info
GPU: a10g
### Who can help?
@kaiyux
### Information
- [X] The official example scripts
- [ ] My own modified scripts
### Tasks
- [X] An officially supported task…