lambda7xx opened this issue 1 year ago (status: Open)
Installed CUDA version 11.8 does not match the version torch was compiled with 11.7 but since the APIs are compatible, accepting this combination
Using /home/lambda7xx/.cache/torch_extensions/py38_cu117 as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file /home/lambda7xx/.cache/torch_extensions/py38_cu117/transformer_inference/build.ninja...
Building extension module transformer_inference...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
ninja: no work to do.
Loading extension module transformer_inference...
Time to load transformer_inference op: 0.5659897327423096 seconds
[2023-03-03 03:40:39,841] [INFO] [logging.py:75:log_dist] [Rank 0] DeepSpeed-Inference config: {'layer_id': 0, 'hidden_size': 9216, 'intermediate_size': 36864, 'heads': 72, 'num_hidden_layers': -1, 'fp16': True, 'pre_layer_norm': True, 'local_rank': -1, 'stochastic_mode': False, 'epsilon': 1e-12, 'mp_size': 2, 'q_int8': False, 'scale_attention': True, 'triangular_masking': True, 'local_attention': False, 'window_size': 1, 'rotary_dim': -1, 'rotate_half': False, 'rotate_every_two': True, 'return_tuple': True, 'mlp_after_attn': True, 'mlp_act_func_type': <ActivationFuncType.ReLU: 2>, 'specialized_mode': False, 'training_mp_size': 1, 'bigscience_bloom': False, 'max_out_tokens': 1024, 'scale_attn_by_inverse_layer_idx': False, 'enable_qkv_quantization': False, 'use_mup': False, 'return_single_tuple': False}
Using /home/lambda7xx/.cache/torch_extensions/py38_cu117 as PyTorch extensions root...
No modifications detected for re-loaded extension module transformer_inference, skipping build step...
Loading extension module transformer_inference...
Time to load transformer_inference op: 0.09874415397644043 seconds
Traceback (most recent call last):
  File "bloom-inference-scripts/bloom-ds-inference.py", line 185, in <module>
    model = deepspeed.init_inference(
  File "/home/lambda7xx/.local/lib/python3.8/site-packages/deepspeed/__init__.py", line 311, in init_inference
    engine = InferenceEngine(model, config=ds_inference_config)
  File "/home/lambda7xx/.local/lib/python3.8/site-packages/deepspeed/inference/engine.py", line 134, in __init__
    self._apply_injection_policy(config)
  File "/home/lambda7xx/.local/lib/python3.8/site-packages/deepspeed/inference/engine.py", line 358, in _apply_injection_policy
    replace_transformer_layer(client_module,
  File "/home/lambda7xx/.local/lib/python3.8/site-packages/deepspeed/module_inject/replace_module.py", line 532, in replace_transformer_layer
    replaced_module = replace_module(model=model,
  File "/home/lambda7xx/.local/lib/python3.8/site-packages/deepspeed/module_inject/replace_module.py", line 797, in replace_module
    replaced_module, _ = _replace_module(model, policy)
  File "/home/lambda7xx/.local/lib/python3.8/site-packages/deepspeed/module_inject/replace_module.py", line 824, in _replace_module
    _, layer_id = _replace_module(child, policies, layer_id=layer_id)
  File "/home/lambda7xx/.local/lib/python3.8/site-packages/deepspeed/module_inject/replace_module.py", line 824, in _replace_module
    _, layer_id = _replace_module(child, policies, layer_id=layer_id)
  File "/home/lambda7xx/.local/lib/python3.8/site-packages/deepspeed/module_inject/replace_module.py", line 824, in _replace_module
    _, layer_id = _replace_module(child, policies, layer_id=layer_id)
  File "/home/lambda7xx/.local/lib/python3.8/site-packages/deepspeed/module_inject/replace_module.py", line 814, in _replace_module
    replaced_module = policies[child.__class__][0](child,
  File "/home/lambda7xx/.local/lib/python3.8/site-packages/deepspeed/module_inject/replace_module.py", line 522, in replace_fn
    new_module = replace_with_policy(child,
  File "/home/lambda7xx/.local/lib/python3.8/site-packages/deepspeed/module_inject/replace_module.py", line 383, in replace_with_policy
    _container.create_module()
  File "/home/lambda7xx/.local/lib/python3.8/site-packages/deepspeed/module_inject/containers/opt.py", line 21, in create_module
    self.module = DeepSpeedOPTInference(_config, mp_group=self.mp_group)
  File "/home/lambda7xx/.local/lib/python3.8/site-packages/deepspeed/model_implementations/transformers/ds_opt.py", line 18, in __init__
    super().__init__(config,
  File "/home/lambda7xx/.local/lib/python3.8/site-packages/deepspeed/model_implementations/transformers/ds_transformer.py", line 70, in __init__
    self.mlp = DeepSpeedMLP(self.config,
  File "/home/lambda7xx/.local/lib/python3.8/site-packages/deepspeed/ops/transformer/inference/ds_mlp.py", line 45, in __init__
    self.output_w = nn.Parameter(torch.empty(intm_size_per_partition,
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 324.00 MiB (GPU 3; 31.75 GiB total capacity; 31.08 GiB already allocated; 216.50 MiB free; 31.08 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

The same OutOfMemoryError was raised on GPU 7:

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 324.00 MiB (GPU 7; 31.75 GiB total capacity; 31.08 GiB already allocated; 216.50 MiB free; 31.08 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
- The opt-66b model should take about 132 GB of memory in fp16, and I have 256 GB of total GPU memory, so I don't think it should OOM.
My DeepSpeed version is 0.8.1, transformers is 4.21.2, and the machine has 8x V100 32GB GPUs.
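A quick back-of-the-envelope check of the per-GPU footprint suggests the aggregate 256 GB is not what matters. This is a rough sketch under stated assumptions: fp16 weights (2 bytes per parameter), decimal GB, even sharding across the `--tp_size 4` ranks from the launch command, and activations plus DeepSpeed's pre-allocated inference workspace not counted:

```python
# Rough per-GPU memory estimate for opt-66b under tensor parallelism.
# Assumptions (not taken from the logs): fp16 weights, even sharding
# across tp_size ranks, decimal GB; activations/workspace ignored.
params = 66e9          # opt-66b parameter count
bytes_per_param = 2    # fp16
tp_size = 4            # --tp_size 4 from the launch command

total_gb = params * bytes_per_param / 1e9   # full model: ~132 GB
per_gpu_gb = total_gb / tp_size             # weights per GPU: ~33 GB
v100_gb = 31.75 * 1024**3 / 1e9             # 31.75 GiB capacity ~ 34.1 GB

print(f"weights/GPU: {per_gpu_gb:.1f} GB of {v100_gb:.1f} GB capacity")
```

Under these assumptions the weights alone leave barely 1 GB per card for activations and the inference kernels' workspace, which would make an OOM plausible even though 256 GB total exceeds the 132 GB model size.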
Loading 14 checkpoint shards: 0%| | 0/14 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "bloom-inference-scripts/bloom-ds-inference.py", line 185, in <module>
    model = deepspeed.init_inference(
  File "/home/lambda7xx/.local/lib/python3.8/site-packages/deepspeed/__init__.py", line 311, in init_inference
    engine = InferenceEngine(model, config=ds_inference_config)
  File "/home/lambda7xx/.local/lib/python3.8/site-packages/deepspeed/inference/engine.py", line 134, in __init__
    self._apply_injection_policy(config)
  File "/home/lambda7xx/.local/lib/python3.8/site-packages/deepspeed/inference/engine.py", line 358, in _apply_injection_policy
    replace_transformer_layer(client_module,
  File "/home/lambda7xx/.local/lib/python3.8/site-packages/deepspeed/module_inject/replace_module.py", line 561, in replace_transformer_layer
    load_model_with_checkpoint(replaced_module,
  File "/home/lambda7xx/.local/lib/python3.8/site-packages/deepspeed/module_inject/load_checkpoint.py", line 279, in load_model_with_checkpoint
    load_module_recursive(r_module)
  File "/home/lambda7xx/.local/lib/python3.8/site-packages/deepspeed/module_inject/load_checkpoint.py", line 273, in load_module_recursive
    load_module_recursive(
  File "/home/lambda7xx/.local/lib/python3.8/site-packages/deepspeed/module_inject/load_checkpoint.py", line 273, in load_module_recursive
    load_module_recursive(
  File "/home/lambda7xx/.local/lib/python3.8/site-packages/deepspeed/module_inject/load_checkpoint.py", line 273, in load_module_recursive
    load_module_recursive(
  File "/home/lambda7xx/.local/lib/python3.8/site-packages/deepspeed/module_inject/load_checkpoint.py", line 271, in load_module_recursive
    layer_policies[child.__class__](child, prefix + name + '.')
  File "/home/lambda7xx/.local/lib/python3.8/site-packages/deepspeed/module_inject/load_checkpoint.py", line 200, in load_transformer_layer
    replace_policy.load_params(module,
AttributeError: 'HFOPTLayerPolicy' object has no attribute 'load_params'
The same AttributeError traceback was raised on the remaining ranks, after which NCCL closed the connections and aborted the communicators on all 8 ranks.
[2023-03-03 03:41:22,399] [INFO] [launch.py:318:sigkill_handler] Killing subprocess 4553
[2023-03-03 03:41:22,433] [INFO] [launch.py:318:sigkill_handler] Killing subprocess 4554
[2023-03-03 03:41:23,542] [INFO] [launch.py:318:sigkill_handler] Killing subprocess 4555
[2023-03-03 03:41:23,839] [INFO] [launch.py:318:sigkill_handler] Killing subprocess 4556
[2023-03-03 03:41:23,842] [INFO] [launch.py:318:sigkill_handler] Killing subprocess 4558
[2023-03-03 03:41:23,843] [INFO] [launch.py:318:sigkill_handler] Killing subprocess 4560
[2023-03-03 03:41:23,846] [INFO] [launch.py:318:sigkill_handler] Killing subprocess 4562
[2023-03-03 03:41:23,848] [INFO] [launch.py:318:sigkill_handler] Killing subprocess 4564
[2023-03-03 03:41:23,850] [ERROR] [launch.py:324:sigkill_handler] ['/usr/bin/python3', '-u', 'bloom-inference-scripts/bloom-ds-inference.py', '--local_rank=7', '--name', 'facebook/opt-66b', '--batch_size', '4', '--tp_size', '4', '--benchmark'] exits with return code = 1
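For reference, the allocator knob that the OOM message points to can be set in the environment before launching. This is only a sketch of the hint in the error text, with an illustrative 128 MB split value, and it is unlikely to help when the weights genuinely fill the card rather than fragment it:

```shell
# Illustrative value; tune or remove as needed. Only addresses
# fragmentation (reserved >> allocated), not true exhaustion.
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
deepspeed bloom-inference-scripts/bloom-ds-inference.py \
    --name facebook/opt-66b --batch_size 4 --tp_size 4 --benchmark
```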