Open liding1992 opened 10 months ago
I also encountered a similar problem
I also encountered the same problem when using deepspeed
May I ask if it has been resolved? I also have this problem
here is my GPU info:
model name:
WizardCoder-15B-V1.0
here is my pkgs info:
Python 3.10.9 cuda 11.7 torch 1.13.1+cu117 transformers 4.28.1
here is my ds_report:
here is my cmd:
deepspeed --num_gpus 2 ds-WizardCoder.py --base_model my_model_path
here is my inference code:
I get an error:
Traceback (most recent call last):
  File "/home/liding/work/DeepSpeedExamples/inference/huggingface/text-generation/ds-WizardCoder.py", line 122, in <module>
    fire.Fire(main)
  File "/home/liding/work/ds_venv/lib/python3.10/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/home/liding/work/ds_venv/lib/python3.10/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/home/liding/work/ds_venv/lib/python3.10/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/home/liding/work/DeepSpeedExamples/inference/huggingface/text-generation/ds-WizardCoder.py", line 88, in main
    model = AutoModelForCausalLM.from_pretrained(
  File "/home/liding/work/ds_venv/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 471, in from_pretrained
    return model_class.from_pretrained(
  File "/home/liding/work/ds_venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2795, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/home/liding/work/ds_venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3123, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
  File "/home/liding/work/ds_venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 698, in _load_state_dict_into_meta_model
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/home/liding/work/ds_venv/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 313, in set_module_tensor_to_device
    new_value = value.to(device)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 288.00 MiB (GPU 0; 22.06 GiB total capacity; 8.58 GiB already allocated; 75.38 MiB free; 8.58 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
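For anyone reading the numbers in this error message, here is a rough sketch of the arithmetic they imply. It only parses the figures quoted in the message above. The helper names (`parse_gib`, `weight_size_gib`) are made up for illustration, and the 15e9 parameter count and 2 bytes per parameter (fp16) are assumptions taken from the model name, since the inference code and dtype are not shown in this post.

```python
import re

# Figures copied verbatim from the OOM message in this post.
OOM_MESSAGE = (
    "Tried to allocate 288.00 MiB (GPU 0; 22.06 GiB total capacity; "
    "8.58 GiB already allocated; 75.38 MiB free; "
    "8.58 GiB reserved in total by PyTorch)"
)

_UNIT_TO_GIB = {"MiB": 1 / 1024, "GiB": 1.0}


def parse_gib(message: str, label: str) -> float:
    """Return the quantity that directly precedes `label`, converted to GiB."""
    match = re.search(r"([\d.]+) (MiB|GiB) " + re.escape(label), message)
    if match is None:
        raise ValueError(f"label not found in message: {label!r}")
    return float(match.group(1)) * _UNIT_TO_GIB[match.group(2)]


def weight_size_gib(num_params: float, bytes_per_param: int) -> float:
    """Approximate size of the raw model weights in GiB (no activations, no KV cache)."""
    return num_params * bytes_per_param / 2**30


total = parse_gib(OOM_MESSAGE, "total capacity")
reserved = parse_gib(OOM_MESSAGE, "reserved in total")
free = parse_gib(OOM_MESSAGE, "free")

# Memory on GPU 0 that is neither free nor reserved by this process's
# PyTorch allocator: other processes, CUDA contexts, and similar overhead.
held_elsewhere = total - reserved - free

# ASSUMPTION: ~15e9 parameters (from the name "15B") stored as fp16 (2 bytes each).
fp16_weights = weight_size_gib(15e9, 2)

print(f"held outside this process: {held_elsewhere:.2f} GiB")
print(f"fp16 weights (assumed):    {fp16_weights:.2f} GiB vs {total:.2f} GiB per GPU")
```

If these assumptions hold, a full fp16 copy of the weights would not fit on one 22 GiB card, and a large share of GPU 0 is held by something other than the process that crashed. One possible explanation is that both ranks started by `deepspeed --num_gpus 2` are loading onto the same device, but that cannot be confirmed without the inference code.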