@czczup @whai362 @ErfeiCui @hjh0119 @lvhan028 @Adushar @Weiyun1025 @cg1177 @opengvlab-admin @qishisuren123 @dlutwy Could you please take a look at this issue?
Please refer to the link to view the minimum GPU memory requirements.
Hi @AmazDeng, the 40B model costs about 80 GB of memory. One A100 GPU is not enough; at least two GPUs are required.
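For reference, a minimal sketch of a two-GPU launch using lmdeploy's `pipeline` API with `TurbomindEngineConfig(tp=2)` to shard the weights across both devices; the model path and the local image path `example.jpg` are illustrative, not from this thread:

```python
from lmdeploy import pipeline, TurbomindEngineConfig
from lmdeploy.vl import load_image

# tp=2 splits the 40B weights across two GPUs via tensor parallelism;
# the full-precision model does not fit on a single 80 GB A100.
pipe = pipeline(
    'OpenGVLab/InternVL2-40B',
    backend_config=TurbomindEngineConfig(tp=2),
)

image = load_image('example.jpg')  # hypothetical local image
response = pipe(('describe this image', image))
print(response.text)
```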
@G-z-w @lvhan028 Thank you, I understand. With a single A100 GPU, lmdeploy can only load the InternVL2-40B-AWQ version.
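A sketch of that single-GPU alternative, assuming the quantized checkpoint is published as `OpenGVLab/InternVL2-40B-AWQ` and loaded with TurboMind's `model_format='awq'` option:

```python
from lmdeploy import pipeline, TurbomindEngineConfig

# The 4-bit AWQ checkpoint is small enough for one 80 GB A100;
# model_format='awq' tells TurboMind to load the quantized weights.
pipe = pipeline(
    'OpenGVLab/InternVL2-40B-AWQ',
    backend_config=TurbomindEngineConfig(model_format='awq', tp=1),
)

print(pipe('Hello, who are you?').text)
```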
Checklist
Describe the bug
I followed the official documentation for InternVL2 (https://internvl.readthedocs.io/en/latest/internvl2.0/deployment.html) and used lmdeploy to load the 40B model, but I encountered an error:

```
RuntimeError: [TM][ERROR] CUDA runtime error: out of memory /lmdeploy/src/turbomind/utils/memory_utils.cu:32
```

My machine is an A100 80G. What could be the issue? lmdeploy officially supports the InternVL2 model.

Reproduction
Environment