@isidentical hi, thanks for the information. We will include CogVLM2 after PR #1502 is merged.
any update?
hi, it's in progress. Any updates will be synced to this issue.
@isidentical @Jayantverma2 hi, guys. CogVLM2 models are supported in PR #1502. If you have time, please give it a try. You are welcome to leave comments in the PR. Thanks.
@RunningLeon Is this the correct way to initialize CogVLM2?
engine = pipeline(model_path, "cogvlm2", log_level="DEBUG")

I have made some changes to config.json:

{
  "architectures": ["CogVLMForCausalLM"],
  "auto_map": {
    "AutoConfig": "configuration_cogvlm.CogVLMConfig",
    "AutoModelForCausalLM": "modeling_cogvlm.CogVLMForCausalLM"
  },
  "vision_config": {
    "dropout_prob": 0.0,
    "hidden_act": "gelu",
    "in_channels": 3,
    "num_hidden_layers": 63,
    "hidden_size": 1792,
    "patch_size": 14,
    "num_heads": 16,
    "intermediate_size": 15360,
    "layer_norm_eps": 1e-06,
    "num_positions": 9217,
    "image_size": 1344
  },
  "hidden_size": 4096,
  "intermediate_size": 14336,
  "num_attention_heads": 32,
  "max_position_embeddings": 8192,
  "rms_norm_eps": 1e-05,
  "template_version": "chat",
  "initializer_range": 0.02,
  "bos_token_id": 128000,
  "eos_token_id": [128001, 128009],
  "pad_token_id": 128002,
  "vocab_size": 128256,
  "num_hidden_layers": 32,
  "hidden_act": "silu",
  "use_cache": true,
  "transformers_version": "4.41.0"
}
But when I run it with this prompt:

prompts = [
    {
        'role': 'user',
        'content': [
            {'type': 'text', 'text': prompt},
            {'type': 'image_url', 'image_url': {'url': f'data:image/jpeg;base64,{image}'}}
        ]
    }
]

it generates b''.
@Tushar-ml hi, please follow the examples in the document: https://lmdeploy.readthedocs.io/en/latest/inference/vl_pipeline.html#vlm-offline-inference-pipeline.
prompts should be like:

prompts = [
    {
        'role': 'user',
        'content': [
            {'type': 'text', 'text': 'describe this image'},
            {'type': 'image_url', 'image_url': {'url': 'https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg'}}
        ]
    }
]
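For context, a minimal end-to-end sketch using that prompt format (the model path below is a placeholder, and this assumes a build with CogVLM2 support installed):

from lmdeploy import pipeline

# placeholder path to the downloaded CogVLM2 weights
pipe = pipeline('/path/to/cogvlm2-llama3-chat-19B')

prompts = [
    {
        'role': 'user',
        'content': [
            {'type': 'text', 'text': 'describe this image'},
            {'type': 'image_url', 'image_url': {'url': 'https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg'}}
        ]
    }
]

response = pipe(prompts)
print(response)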
@RunningLeon any docs on how to run CogVLM2? As mentioned in the PR, the tokenizer needs to be applied manually.
awesome, look forward to it. Really like lmdeploy because it's much more stable than sglang for these vision models.
@Tushar-ml hi, no need to do so for CogVLM2, but you should for CogVLM(1).
@pseudotensor hi, glad to hear that. If possible, please recommend lmdeploy to other people who are interested in deploying LLMs and VLMs. Thanks.
Yes, will gladly do that.
@RunningLeon I am getting OOM on an A40 with 48 GB VRAM. What is the recommended system for CogVLM2, given the model is no more than 40 GB in size?
@Tushar-ml hi, could you provide your sample code? Normally, you can reduce cache_max_entry_count to shrink the KV cache memory, and also reduce max_prefill_token_num, both from PytorchEngineConfig.
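For example, a minimal sketch of passing both options (the values and model path here are illustrative assumptions, not recommendations):

from lmdeploy import pipeline, PytorchEngineConfig

# cache_max_entry_count is the fraction of free GPU memory reserved for the
# KV cache (default 0.8); lowering it frees memory for weights/activations.
# max_prefill_token_num caps how many tokens are prefilled per step.
backend_config = PytorchEngineConfig(
    cache_max_entry_count=0.4,   # illustrative value
    max_prefill_token_num=2048,  # illustrative value
)
pipe = pipeline('/path/to/cogvlm2-llama3-chat-19B',  # placeholder path
                backend_config=backend_config)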
Thanks @RunningLeon I will try this
@RunningLeon Hi! Due to server network limitations, I could not compile and install the latest lmdeploy on the server, so I pulled the lmdeploy 0.4.2 image from Docker Hub and ran it. Then running CogVLM2 reported an error:
root@gpu9:~/data/CogVLM2# python cogvlm_demo.py
2024-05-31 01:31:08,920 - lmdeploy - ERROR - TypeError: expected string or bytes-like object
2024-05-31 01:31:08,920 - lmdeploy - ERROR - test failed!
model /root/data/cogvlm2-llama3-chinese-chat-19B/ requires transformers version None but transformers 4.40.2 is installed.
my code:

from lmdeploy import pipeline
from lmdeploy.vl import load_image

model_path = '/root/data/cogvlm2-llama3-chinese-chat-19B/'
pipe = pipeline(model_path)
image = load_image('/root/data/dataset/misumi_data/images/Misumi000006.jpg')
response = pipe(('图中出现的零件是什么?', image))  # "What is the part shown in the image?"
print(response)
I look forward to your reply. Thank you
@GuoXu-booo hi, CogVLM is supported in the PyTorch engine, so you can simply clone the code from the PR and run pip install -e . to install it. BTW, you'd better use the latest code from PR #1502. The env check fails in your case because there's no transformers_version in your config.json, which is fixed in that branch:
git clone --recursive -b support-cogvlm-dev https://github.com/RunningLeon/lmdeploy.git
cd lmdeploy
pip install -e .
@RunningLeon are there any plans to use TurboMind for CogVLM, since it is faster for Llama 3?
Sorry, no plan yet.
Motivation
CogVLM2 is now the SOTA open source VLM for captioning tasks.
Related resources
No response
Additional context
No response