THUDM / CogVLM2

GPT4V-level open-source multi-modal model based on Llama3-8B
Apache License 2.0

How much GPU memory is required to run inference with CogVLM2 in float16? #114

Closed. Luciennnnnnn closed this issue 2 months ago.

Luciennnnnnn commented 3 months ago

Feature request

How much GPU memory is required to run inference with CogVLM2 in float16?

Motivation

None.

Your contribution

None.

liuky74 commented 3 months ago

16GB*3

Luciennnnnnn commented 3 months ago

> 16GB*3

Where does the *3 come from?

liuky74 commented 3 months ago

> 16GB*3
>
> Where does the *3 come from?

I deployed the FP16 model on my server with three RTX 3090s; nvidia-smi looks like this: [screenshot of nvidia-smi output]
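
For context (the thread leaves this implicit): the chat model has roughly 19B parameters, so the FP16 weights alone take about 19e9 × 2 bytes ≈ 38 GB, and activations plus the KV cache add several more GB, which is why the load spreads to roughly 16 GB on each of three cards. A minimal multi-GPU loading sketch, assuming the Hugging Face transformers/accelerate stack and the public THUDM/cogvlm2-llama3-chat-19B checkpoint (the flags here are illustrative, not the repo's verbatim demo code):

```python
# Sketch: shard CogVLM2 FP16 weights across all visible GPUs.
# Assumes transformers + accelerate are installed; flags are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_PATH = "THUDM/cogvlm2-llama3-chat-19B"  # ~19B params => ~38 GB in FP16

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, trust_remote_code=True)

# device_map="auto" lets accelerate split the layers across the available
# cards, which is how ~38 GB of weights lands as roughly 13 GB per 3090.
model = AutoModelForCausalLM.from_pretrained(
    MODEL_PATH,
    torch_dtype=torch.float16,
    trust_remote_code=True,
    device_map="auto",
).eval()
```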

KSimulation commented 3 months ago

Would a single A100 with 40 GB of VRAM be enough?

zRzRzRzRzRzRzR commented 2 months ago

It works, but only for a few conversation turns before memory overflows; peak usage exceeds 40 GB.
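
A possible workaround for a single 40 GB card (not suggested in the thread; a sketch assuming bitsandbytes 4-bit quantization via the standard transformers BitsAndBytesConfig, at some cost in output quality):

```python
# Sketch: load CogVLM2 with 4-bit quantized weights to fit one ~40 GB GPU.
# Assumes bitsandbytes is installed; parameters are illustrative.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

MODEL_PATH = "THUDM/cogvlm2-llama3-chat-19B"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,  # matmuls still run in FP16
)

# 4-bit weights shrink the ~38 GB FP16 footprint to roughly a quarter,
# leaving headroom for activations and a growing KV cache on one card.
model = AutoModelForCausalLM.from_pretrained(
    MODEL_PATH,
    quantization_config=bnb_config,
    trust_remote_code=True,
).eval()
```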