xzzWZY opened this issue 11 months ago
@xzzWZY This looks like a bug that we fixed in the latest DeepSpeed. We will push a release soon. In the meantime, please install with:
pip install git+https://github.com/Microsoft/DeepSpeed.git git+https://github.com/Microsoft/DeepSpeed-MII.git
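Once installed from source, a quick sanity check that the git builds are active (assuming both packages expose `__version__`; source installs should report versions with a commit suffix):

```python
import deepspeed
import mii

# Source installs report versions like "0.14.1+a8b82153".
print(deepspeed.__version__)
print(mii.__version__)
```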
My Python environment is:

deepspeed 0.14.1+a8b82153
deepspeed-kernels 0.0.1.dev1698255861
deepspeed-mii 0.2.4+26a853d

but the error still shows up:
[WARNING] sparse_attn requires a torch version >= 1.5 and < 2.0 but detected 2.2
[WARNING] using untested triton version (2.2.0), only 1.0.0 is known to be compatible
usage: multi_gpu_server.py [-h] [--deployment-name DEPLOYMENT_NAME] [--model-config MODEL_CONFIG] [--server-port SERVER_PORT] [--zmq-port ZMQ_PORT] [--load-balancer]
[--load-balancer-port LOAD_BALANCER_PORT] [--restful-gateway] [--restful-gateway-port RESTFUL_GATEWAY_PORT] [--restful-gateway-host RESTFUL_GATEWAY_HOST]
[--restful-gateway-procs RESTFUL_GATEWAY_PROCS]
multi_gpu_server.py: error: argument --deployment-name: expected one argument
[2024-04-12 00:45:57,354] [INFO] [launch.py:316:sigkill_handler] Killing subprocess 2282
[2024-04-12 00:45:57,354] [ERROR] [launch.py:322:sigkill_handler] ['/root/miniconda3/bin/python', '-m', 'mii.launch.multi_gpu_server', '--deployment-name', '-mii-deployment', '--load-balancer-port', '50050', '--restful-gateway-port', '51080', '--restful-gateway-host', 'localhost', '--restful-gateway-procs', '32', '--server-port', '50051', '--zmq-port', '25555', '--model-config', 'eyJtb2RlbF9uYW1lX29yX3BhdGgiOiAiL3Jvb3QvLmNhY2hlL21vZGVsc2NvcGUvaHViL0FJLU1vZGVsU2NvcGUvcGhpLTIvIiwgInRva2VuaXplciI6ICIvcm9vdC8uY2FjaGUvbW9kZWxzY29wZS9odWIvQUktTW9kZWxTY29wZS9waGktMi8iLCAidGFzayI6ICJ0ZXh0LWdlbmVyYXRpb24iLCAidGVuc29yX3BhcmFsbGVsIjogMSwgInF1YW50aXphdGlvbl9tb2RlIjogbnVsbCwgImluZmVyZW5jZV9lbmdpbmVfY29uZmlnIjogeyJ0ZW5zb3JfcGFyYWxsZWwiOiB7InRwX3NpemUiOiAxfSwgInN0YXRlX21hbmFnZXIiOiB7Im1heF90cmFja2VkX3NlcXVlbmNlcyI6IDIwNDgsICJtYXhfcmFnZ2VkX2JhdGNoX3NpemUiOiA3NjgsICJtYXhfcmFnZ2VkX3NlcXVlbmNlX2NvdW50IjogNTEyLCAibWF4X2NvbnRleHQiOiA4MTkyLCAibWVtb3J5X2NvbmZpZyI6IHsibW9kZSI6ICJyZXNlcnZlIiwgInNpemUiOiAxMDAwMDAwMDAwfSwgIm9mZmxvYWQiOiBmYWxzZX0sICJxdWFudGl6YXRpb24iOiB7InF1YW50aXphdGlvbl9tb2RlIjogbnVsbH19LCAidG9yY2hfZGlzdF9wb3J0IjogMjk1MDAsICJ6bXFfcG9ydF9udW1iZXIiOiAyNTU1NSwgInJlcGxpY2FfbnVtIjogMSwgInJlcGxpY2FfY29uZmlncyI6IFt7Imhvc3RuYW1lIjogImxvY2FsaG9zdCIsICJ0ZW5zb3JfcGFyYWxsZWxfcG9ydHMiOiBbNTAwNTFdLCAidG9yY2hfZGlzdF9wb3J0IjogMjk1MDAsICJncHVfaW5kaWNlcyI6IFswXSwgInptcV9wb3J0IjogMjU1NTV9XSwgImRldmljZV9tYXAiOiAiYXV0byIsICJtYXhfbGVuZ3RoIjogbnVsbCwgInN5bmNfZGVidWciOiBmYWxzZSwgInByb2ZpbGVfbW9kZWxfdGltZSI6IGZhbHNlfQ=='] exits with return code = 2
[2024-04-12 00:45:57,633] [INFO] [server.py:65:_wait_until_server_is_live] waiting for server to start...
[2024-04-12 00:45:57,633] [INFO] [server.py:65:_wait_until_server_is_live] waiting for server to start...
[2024-04-12 00:46:02,636] [INFO] [server.py:65:_wait_until_server_is_live] waiting for server to start...
[2024-04-12 00:46:02,636] [INFO] [server.py:65:_wait_until_server_is_live] waiting for server to start...
Traceback (most recent call last):
File "
When I use

to serve Llama-2 with DeepSpeed-MII, I encounter this issue:

Then, when I replace Llama-2-13B-chat with Llama-2-7B-chat, I encounter another error:

Llama-2-13B-chat and Llama-2-7B-chat are from HF: meta-llama/Llama-2-13b-chat-hf
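For reference, a minimal sketch of the kind of persistent Llama-2 deployment described above, assuming the `mii.serve` API and access to the gated meta-llama weights on HF (the deployment name and prompt are placeholders):

```python
import mii

# Start a persistent Llama-2 deployment on a single GPU.
client = mii.serve(
    "meta-llama/Llama-2-13b-chat-hf",
    deployment_name="llama2-13b-chat",  # explicit, placeholder name
    tensor_parallel=1,
)
response = client.generate("What is DeepSpeed-MII?", max_new_tokens=128)
print(response)
# Tear the server down when finished.
client.terminate_server()
```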