THUDM / ChatGLM-6B

ChatGLM-6B: An Open Bilingual Dialogue Language Model | 开源双语对话语言模型

[Help] With zero_stage=3 and offload disabled, ChatGLM-6B on a single machine with 4 GPUs seems to need ~28 GB of GPU memory per card; this seems not to use any model-parallel capability. Why is that? #1442

Open tubaobao3 opened 6 months ago

tubaobao3 commented 6 months ago

Is there an existing issue for this?

Current Behavior

```
(venv) [app@vm_0_1_centos projects]$ python ds_estimate.py
Loading checkpoint shards: 100%|██████████| 8/8 [00:12<00:00, 1.59s/it]
Estimated memory needed for params, optim states and gradients for a:
HW: Setup with 1 node, 4 GPUs per node.
SW: Model with 6173M total params, 534M largest layer params.
  per CPU  |  per GPU |   Options
  155.23GB |   1.99GB | offload_param=cpu , offload_optimizer=cpu , zero_init=1
  155.23GB |   1.99GB | offload_param=cpu , offload_optimizer=cpu , zero_init=0
  137.98GB |   4.87GB | offload_param=none, offload_optimizer=cpu , zero_init=1
  137.98GB |   4.87GB | offload_param=none, offload_optimizer=cpu , zero_init=0
   11.95GB |  27.86GB | offload_param=none, offload_optimizer=none, zero_init=1
  137.98GB |  27.86GB | offload_param=none, offload_optimizer=none, zero_init=0
```

ds_estimate.py:

```python
from transformers import AutoModel
from deepspeed.runtime.zero.stage3 import estimate_zero3_model_states_mem_needs_all_live

model = AutoModel.from_pretrained('/data/projects/ChatGLM-6B', trust_remote_code=True)
estimate_zero3_model_states_mem_needs_all_live(model, num_gpus_per_node=4, num_nodes=1)
```
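As a sanity check on those numbers, the 27.86 GB per-GPU figure in the no-offload rows can be reproduced by hand. DeepSpeed's ZeRO-3 estimator appears to budget about 18 bytes per parameter (fp16 parameters and gradients plus fp32 Adam states), sharded across all GPUs, plus 4 bytes per parameter of the largest layer, which must be gathered on each GPU. A minimal sketch of that arithmetic, assuming those byte factors:

```python
# Reproducing the estimator's no-offload per-GPU figure.
# Assumption: 18 bytes/param = 2 (fp16 param) + 2 (fp16 grad)
# + 12 (fp32 master param, Adam momentum, Adam variance), sharded
# across GPUs; plus 4 bytes/param for the largest gathered layer.
total_params = 6173e6          # from the estimator output
largest_layer_params = 534e6   # from the estimator output
num_gpus = 4
GiB = 1024 ** 3

sharded_states = 18 * total_params / num_gpus / GiB   # ~25.87 GiB
gathered_layer = 4 * largest_layer_params / GiB       # ~1.99 GiB
print(f"per-GPU: {sharded_states + gathered_layer:.2f} GiB")  # -> 27.86
```

In other words, ZeRO-3 does partition the training states across the 4 GPUs (the per-GPU share is total/4); the figure is large simply because full training states for 6.17B parameters come to roughly 110 GB before sharding.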

Expected Behavior

No response

Steps To Reproduce

Same as above: run `python ds_estimate.py` with the ds_estimate.py script shown under Current Behavior.

Environment

- OS: CentOS 7
- Python: 3.8
- Transformers: 4.29.1
- PyTorch: 2.0.8
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) : True

Anything else?

No response

tubaobao3 commented 6 months ago

I am fine-tuning a model based on ChatGLM-6B, and it fails right at the DeepSpeed initialization stage. My setup is a single machine with 4 GPUs, each NVIDIA card with 15 GB of memory. During DS initialization, memory usage on all 4 GPUs climbs to 12 GB; card 3 already had 3 GB in use beforehand, so when it requested more memory the program crashed with an OOM. Does this mean that at the DS initialization stage alone, with stage=3 and offload disabled, on a single machine with 4 GPUs, the model cannot fit in 12 GB per card?
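For what it's worth, the estimator output above suggests this is expected: the no-offload rows need ~27.86 GB per GPU, well beyond 15 GB cards, while the offload rows fit in 1.99 to 4.87 GB per GPU at the cost of roughly 138 to 155 GB of host RAM. A minimal, untested sketch of a ZeRO-3 config with CPU offload enabled (standard DeepSpeed config keys; the batch-size and fp16 values are placeholder assumptions, not taken from this issue):

```python
import deepspeed

# Sketch only: ZeRO-3 with parameter and optimizer offload to CPU,
# matching the "offload_param=cpu, offload_optimizer=cpu" estimator row
# (~1.99 GB per GPU, ~155 GB host RAM). Batch size and fp16 settings
# are placeholders.
ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "fp16": {"enabled": True},
    "zero_optimization": {
        "stage": 3,
        "offload_param": {"device": "cpu", "pin_memory": True},
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
    },
}

# model and its parameters come from the fine-tuning script:
# engine, optimizer, _, _ = deepspeed.initialize(
#     model=model, model_parameters=model.parameters(), config=ds_config)
```

The same settings can equally be written to a JSON file and passed via `--deepspeed ds_config.json`; the key point is that without `offload_param`/`offload_optimizer`, the per-GPU share of the states alone exceeds a 15 GB card.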