-
### Motivation
Hi team,
ZhipuAI just released their multi-modal model `CogVLM2-Video-LLama3-Chat`. Can we support serving it with TorchEngine? It seems that they use a new causal model architectu…
-
Popular VLMs for image captioning, such as blip2 and CogVLM, are not truly open source, as either their weights or their training data restrict commercial use.
-
File "demo.py", line 44, in
model, model_args = AutoModel.from_pretrained('/code/CogVLM-main/vicuna-7b-v1.5', args=argparse.Namespace(
File "/opt/conda/lib/python3.8/site-packages/sat/model/ba…
-
**Routine checks**
[//]: # (Delete the existing space inside the box and fill in an x)
+ [x] I have confirmed that there is no similar existing issue
+ [x] I have confirmed that I have upgraded to the latest version
+ [x] I have read the project README in full and confirmed that the current version cannot meet my needs
+ [x] I understand and am willing to follow up on this issue, help with testing, and provide feedback
+ [x] I understand and accept the above, and I understand that the maintainers' time is limited; **issues that do not follow the rules…
-
### Search before asking
- [X] I have searched the Inference [issues](https://github.com/roboflow/inference/issues) and found no similar bug report.
### Bug
I'm encountering an issue while attempt…
-
https://github.com/THUDM/CogVLM is an open-source model we released, with a demo. It performs better than LLaVA in most cases.
-
I have encountered the problem mentioned in the title. Could someone help me understand what is going on and how to resolve it?
Any assistance would be greatly appreciated.
-
The maximum number of new tokens differs across LMMs.
I believe we should have a consistent MAX_NEW_TOKENS across the project, set to 512 or 1024.
If it makes sense, I can create a PR to modify al…
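If helpful, here is a minimal sketch of what a single shared default could look like: the constant lives in one place and every demo imports it instead of hard-coding its own cap. The module layout and helper name below are hypothetical, not the project's actual code.

```python
# Hypothetical shared constant -- in a real PR this would live in a common
# module (e.g. constants.py) imported by every demo instead of per-file values.
MAX_NEW_TOKENS = 1024


def generation_kwargs(temperature: float = 0.8) -> dict:
    """Keyword arguments each demo would pass to model.generate()."""
    return {"max_new_tokens": MAX_NEW_TOKENS, "temperature": temperature}


if __name__ == "__main__":
    print(generation_kwargs())  # {'max_new_tokens': 1024, 'temperature': 0.8}
```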
-
### Describe the bug
Due to network restrictions, I cannot use Xinference to pull models online. I downloaded the model weights of cogvlm2-llama3-chinese-chat-19B to my local machine, and then used …
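As a point of reference, here is a minimal sketch that sanity-checks the locally downloaded weights with plain transformers, independent of Xinference; the local path is an assumption, and the dtype and trust_remote_code settings follow the usual CogVLM2 loading pattern.

```python
# Sanity-check sketch: load locally downloaded cogvlm2-llama3-chinese-chat-19B
# weights with transformers, bypassing any online pull. The path below is an
# assumption -- point it at wherever the weights were saved.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

LOCAL_PATH = "/data/models/cogvlm2-llama3-chinese-chat-19B"  # hypothetical path

tokenizer = AutoTokenizer.from_pretrained(LOCAL_PATH, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    LOCAL_PATH,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    low_cpu_mem_usage=True,
).eval()
print(type(model).__name__)  # confirms the custom CogVLM2 class was resolved
```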
-
### Feature request
Hi, I've found support for the downstream task of referring expression comprehension (REC)
and the corresponding model on HF, THUDM/cogvlm-grounding-generalist-hf.
Want to ask if ther…
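For context, here is a rough sketch of how an REC-style query might be issued against THUDM/cogvlm-grounding-generalist-hf, following the usage pattern shown in the CogVLM HF example code as I recall it; the prompt wording, image path, and tensor plumbing are assumptions and should be verified against the model card.

```python
# Rough REC sketch for THUDM/cogvlm-grounding-generalist-hf. The query wording,
# image path, and input plumbing below are assumptions based on the published
# CogVLM HF example code, not a verified recipe.
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained("lmsys/vicuna-7b-v1.5")
model = AutoModelForCausalLM.from_pretrained(
    "THUDM/cogvlm-grounding-generalist-hf",
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    trust_remote_code=True,
).to("cuda").eval()

image = Image.open("example.jpg").convert("RGB")  # hypothetical image
query = "Where is the red car? Answer in [[x0,y0,x1,y1]] format."  # assumed REC prompt

inputs = model.build_conversation_input_ids(
    tokenizer, query=query, history=[], images=[image]
)
inputs = {
    "input_ids": inputs["input_ids"].unsqueeze(0).to("cuda"),
    "token_type_ids": inputs["token_type_ids"].unsqueeze(0).to("cuda"),
    "attention_mask": inputs["attention_mask"].unsqueeze(0).to("cuda"),
    "images": [[inputs["images"][0].to("cuda").to(torch.bfloat16)]],
}
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=128, do_sample=False)
    out = out[:, inputs["input_ids"].shape[1]:]  # keep only the generated part
print(tokenizer.decode(out[0], skip_special_tokens=True))  # expected: box coordinates
```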