I am trying DeepSpeed-MII in local mode with the Hugging Face model bigscience/bloomz-7b1-mt.
With tensor_parallel=4 it runs successfully, but with tensor_parallel set to 5, 6, 7, or 8 the server fails to start:
[2023-04-28 11:44:09,797] [ERROR] [launch.py:434:sigkill_handler] ['/data/project/DeepSpeed-MII/venv/bin/python', '-m',
'mii.launch.multi_gpu_server', '--deployment-name', 'bloomz-7b1-mt_deployment', '--task-name', 'text-generation',
'--model', '/data/model/bloomz-7b1-mt', '--model-path', '/tmp/mii_models', '--port', '50950', '--ds-optimize', '--provider',
'hugging-face', '--config', 'eyJ0ZW5zb.........'] exits with return code = -7
[2023-04-28 11:44:09,892] [INFO] [server.py:82:_wait_until_server_is_live] waiting for server to start...
Traceback (most recent call last):
  File "/data/project/DeepSpeed-MII/examples/local/text-generation-bloom-example.py", line 48, in <module>
    mii.deploy(task='text-generation',
  File "/data/project/DeepSpeed-MII/mii/deployment.py", line 142, in deploy
    return _deploy_local(deployment_name, model_path=model_path)
  File "/data/project/DeepSpeed-MII/mii/deployment.py", line 148, in _deploy_local
    mii.utils.import_score_file(deployment_name).init()
  File "/tmp/mii_cache/bloomz-7b1-mt_deployment/score.py", line 27, in init
    mii.MIIServer(deployment_name,
  File "/data/project/DeepSpeed-MII/mii/server.py", line 67, in __init__
    self._wait_until_server_is_live(processes, deployment)
  File "/data/project/DeepSpeed-MII/mii/server.py", line 79, in _wait_until_server_is_live
    raise RuntimeError(
RuntimeError: server crashed for some reason, unable to proceed
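
For reading the `return code = -7` in the launcher log above: Python's subprocess module reports a negative return code -N on POSIX when the child process was killed by signal N. Assuming the launcher passes that value through unchanged (I have not confirmed this), the signal can be decoded with a small sketch like the one below. The helper name `describe_return_code` is my own and is not part of DeepSpeed or MII.

```python
import signal


def describe_return_code(code):
    """Return the signal name for a negative subprocess-style return code.

    Hypothetical helper for illustration only. Returns None when the code
    is zero or positive, i.e. the process was not killed by a signal.
    """
    if code >= 0:
        return None
    try:
        # Signal numbers are platform-specific, so look the name up locally.
        return signal.Signals(-code).name
    except ValueError:
        return f"unknown signal {-code}"


# The value taken from the log line "exits with return code = -7".
print(describe_return_code(-7))
```

Signal numbering differs between platforms, so this should be run on the same machine that produced the log. On typical Linux x86-64 systems signal 7 is SIGBUS, which would mean the worker process was killed by the operating system rather than failing with a Python exception.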