microsoft / DeepSpeed-MII

MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
Apache License 2.0

Unable to run benchmark #170

Closed CrossNox closed 1 year ago

CrossNox commented 1 year ago

I'm trying to run the txt2img benchmark, but can't get it to work.

I tried several combinations of versions for deepspeed (incl. from source), deepspeed-mii (incl. from source), torch==1.13.1 (also with 1.12.1 when using cuda 11.6), diffusers, transformers and triton.

Environment:

The following all fail with {created_time:"2023-04-24T15:34:23.563287825+00:00", grpc_status:2, grpc_message:"Exception calling application: Triton Error [CUDA]: invalid argument"}":

pip install -U deepspeed==0.7.5 deepspeed-mii==0.0.3 torch==1.13.1 diffusers==0.7.1 transformers==4.24.0 triton==2.0.0.dev20221030 ftfy

pip install -U git+https://github.com/microsoft/DeepSpeed.git@c9c6ab9e32b054136c3a125900d6e2ed937432be deepspeed-mii==0.0.4 torch==1.13.1 diffusers==0.11.1 transformers==4.24.0 triton==2.0.0.dev20221202 ftfy

pip install -U deepspeed==0.8.0 deepspeed-mii==0.0.4 torch==1.13.1 diffusers==0.11.1 transformers==4.24.0 triton==2.0.0.dev20221202 ftfy

pip install -U deepspeed==0.8.0 deepspeed-mii==0.0.4 torch==1.13.1 diffusers==0.11.1 transformers==4.24.0 triton==2.0.0.dev20221202 ftfy

pip install -U deepspeed==0.7.7 deepspeed-mii==0.0.4 torch==1.13.1 diffusers==0.10.2 transformers==4.24.0 triton==2.0.0.dev20221202 ftfy

pip install -U deepspeed==0.9.1 deepspeed-mii==0.0.3 torch==1.13.1 diffusers==0.14.0 transformers==4.24.0 triton==2.0.0.dev20221202 ftfy fails with ImportError: cannot import name 'CLIPTextModelWithProjection' from 'transformers' (/home/ec2-user/.local/lib/python3.9/site-packages/transformers/__init__.py)

pip install -U deepspeed==0.9.1 deepspeed-mii==0.0.3 torch==1.13.1 diffusers==0.14.0 transformers==4.26.0 triton==2.0.0.dev20221202 ftfy fails with AttributeError: 'StableDiffusionPipeline' object has no attribute 'children'

pip install -U deepspeed==0.7.5 deepspeed-mii==0.0.3 torch==1.13.1 diffusers==0.11.1 transformers==4.24.0 triton==2.0.0.dev20221202 ftfy fails with {created_time:"2023-04-24T16:19:36.666852314+00:00", grpc_status:2, grpc_message:"Exception calling application: \'DSUNet\' object has no attribute \'config\'"}"

pip install -U git+https://github.com/microsoft/DeepSpeed.git@35eabb0a336e7a8e9950a550475ceaebda42066c deepspeed-mii==0.0.4 torch==1.13.1 diffusers==0.11.1 transformers==4.24.0 triton==2.0.0.dev20221202 ftfy fails with {grpc_message:"Exception calling application: forward() got an unexpected keyword argument \'encoder_hidden_states\'", grpc_status:2, created_time:"2023-04-24T16:34:55.067998104+00:00"}". Same for pip install -U deepspeed==0.7.7 deepspeed-mii==0.0.4 torch==1.13.1 diffusers==0.11.1 transformers==4.24.0 triton==2.0.0.dev20221202 ftfy

This issue seems relevant here.

pip install -U deepspeed==0.7.6 deepspeed-mii==0.0.4 torch==1.13.1 diffusers==0.11.1 transformers==4.24.0 triton==2.0.0.dev20221202 ftfy fails with {created_time:"2023-04-24T17:29:38.723537173+00:00", grpc_status:2, grpc_message:"Exception calling application: \'DSUNet\' object has no attribute \'config\'"}". Same for pip install -U deepspeed==0.7.6 deepspeed-mii==0.0.4 torch==1.13.1 diffusers==0.11.1 transformers==4.24.0 triton==2.0.0.dev20221202 ftfy

This other issue seems relevant here.

Is there any set of versions that is well known to work?

mrwyattii commented 1 year ago

@CrossNox thank you for reporting this issue. It looks like a recent change in DeepSpeed was not accounted for in MII. Please try this PR: #172

I just tested with the following:

deepspeed          0.9.2
diffusers          0.14.0
torch              1.13.1
transformers       4.28.1
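
To compare a local environment against the versions listed above, something like the following could be used. This is a minimal sketch, not part of MII or DeepSpeed: `TESTED`, `base_version`, `find_mismatches`, and `installed_versions` are hypothetical names chosen for illustration, and the comparison ignores local build suffixes such as `+cu117`.

```python
from importlib import metadata

# Versions reported as tested in this thread (copied from the list above).
TESTED = {
    "deepspeed": "0.9.2",
    "diffusers": "0.14.0",
    "torch": "1.13.1",
    "transformers": "4.28.1",
}


def base_version(version: str) -> str:
    """Drop a local build suffix such as '+cu117' or '+0e357666'."""
    return version.split("+", 1)[0]


def find_mismatches(installed: dict, expected: dict = TESTED) -> dict:
    """Return {package: (installed, expected)} for every package that differs."""
    mismatches = {}
    for name, want in expected.items():
        have = installed.get(name)
        if have is None or base_version(have) != want:
            mismatches[name] = (have, want)
    return mismatches


def installed_versions(names) -> dict:
    """Look up installed versions; packages that are not installed map to None."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    report = find_mismatches(installed_versions(TESTED))
    for name, (have, want) in report.items():
        print(f"{name}: installed {have}, tested {want}")
```

An empty report only means the four listed packages match. It says nothing about triton or the GPU, both of which come up later in this thread.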

CrossNox commented 1 year ago

Hi @mrwyattii, thanks for the reply. deepspeed 0.9.2 is not available on PyPI, nor is there a tagged version on GitHub. Did you install from source? At which commit?

Also, which triton version did you get it to work with?

For completeness:

packages installed

```text
$ pip show deepspeed diffusers torch transformers triton deepspeed-mii
Name: deepspeed
Version: 0.9.2+0e357666
Summary: DeepSpeed library
Home-page: http://deepspeed.ai
Author: DeepSpeed Team
Author-email: deepspeed-info@microsoft.com
License: MIT
Location: /home/ec2-user/venv39/lib/python3.9/site-packages
Requires: hjson, ninja, numpy, packaging, psutil, py-cpuinfo, pydantic, torch, tqdm
Required-by: deepspeed-mii
---
Name: diffusers
Version: 0.14.0
Summary: Diffusers
Home-page: https://github.com/huggingface/diffusers
Author: The HuggingFace team
Author-email: patrick@huggingface.co
License: Apache
Location: /home/ec2-user/venv39/lib/python3.9/site-packages
Requires: filelock, huggingface-hub, importlib-metadata, numpy, Pillow, regex, requests
Required-by:
---
Name: torch
Version: 1.13.1
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org/
Author: PyTorch Team
Author-email: packages@pytorch.org
License: BSD-3
Location: /home/ec2-user/venv39/lib/python3.9/site-packages
Requires: nvidia-cublas-cu11, nvidia-cuda-nvrtc-cu11, nvidia-cuda-runtime-cu11, nvidia-cudnn-cu11, typing-extensions
Required-by: deepspeed, deepspeed-mii, triton
---
Name: transformers
Version: 4.28.1
Summary: State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow
Home-page: https://github.com/huggingface/transformers
Author: The Hugging Face team (past and future) with the help of all our contributors (https://github.com/huggingface/transformers/graphs/contributors)
Author-email: transformers@huggingface.co
License: Apache 2.0 License
Location: /home/ec2-user/venv39/lib/python3.9/site-packages
Requires: filelock, huggingface-hub, numpy, packaging, pyyaml, regex, requests, tokenizers, tqdm
Required-by: deepspeed-mii
---
Name: triton
Version: 2.0.0.dev20221202
Summary: A language and compiler for custom Deep Learning operations
Home-page: https://github.com/openai/triton/
Author: Philippe Tillet
Author-email: phil@openai.com
License:
Location: /home/ec2-user/venv39/lib/python3.9/site-packages
Requires: cmake, filelock, torch
Required-by:
---
Name: deepspeed-mii
Version: 0.0.5+835a2a9
Summary: deepspeed mii
Home-page: http://deepspeed.ai
Author: DeepSpeed Team
Author-email: deepspeed-mii@microsoft.com
License: UNKNOWN
Location: /home/ec2-user/venv39/lib/python3.9/site-packages
Requires: asyncio, deepspeed, Flask-RESTful, grpcio, grpcio-tools, pydantic, torch, transformers, Werkzeug
Required-by:
```

output of ds_report

```text
$ ds_report
--------------------------------------------------
DeepSpeed C++/CUDA extension op report
--------------------------------------------------
NOTE: Ops not installed will be just-in-time (JIT) compiled at runtime if needed. Op compatibility means that your system meet the required dependencies to JIT install the op.
--------------------------------------------------
JIT compiled ops requires ninja
ninja .................. [OKAY]
--------------------------------------------------
op name ................ installed .. compatible
--------------------------------------------------
async_io ............... [NO] ....... [OKAY]
cpu_adagrad ............ [NO] ....... [OKAY]
cpu_adam ............... [NO] ....... [OKAY]
fused_adam ............. [NO] ....... [OKAY]
fused_lamb ............. [NO] ....... [OKAY]
quantizer .............. [NO] ....... [OKAY]
random_ltd ............. [NO] ....... [OKAY]
[WARNING] using untested triton version (2.0.0), only 1.0.0 is known to be compatible
sparse_attn ............ [NO] ....... [NO]
spatial_inference ...... [NO] ....... [OKAY]
transformer ............ [NO] ....... [OKAY]
stochastic_transformer . [NO] ....... [OKAY]
transformer_inference .. [NO] ....... [OKAY]
utils .................. [NO] ....... [OKAY]
--------------------------------------------------
DeepSpeed general environment info:
torch install path ............... ['/home/ec2-user/venv39/lib/python3.9/site-packages/torch']
torch version .................... 1.13.1+cu117
deepspeed install path ........... ['/home/ec2-user/venv39/lib/python3.9/site-packages/deepspeed']
deepspeed info ................... 0.9.2+0e357666, 0e357666, master
torch cuda version ............... 11.7
torch hip version ................ None
nvcc version ..................... 11.7
deepspeed wheel compiled w. ...... torch 0.0, cuda 0.0
```

python


$ python --version
Python 3.9.16

error

```text
Traceback (most recent call last):
  File "/home/ec2-user/DeepSpeed-MII/examples/benchmark/txt2img/mii-sd.py", line 27, in <module>
    results = pipe.query(prompts)
  File "/home/ec2-user/venv39/lib/python3.9/site-packages/mii/client.py", line 125, in query
    response = self.asyncio_loop.run_until_complete(
  File "/usr/local/lib/python3.9/asyncio/base_events.py", line 647, in run_until_complete
    return future.result()
  File "/home/ec2-user/venv39/lib/python3.9/site-packages/mii/client.py", line 109, in _query_in_tensor_parallel
    await responses[0]
  File "/home/ec2-user/venv39/lib/python3.9/site-packages/mii/client.py", line 72, in _request_async_response
    proto_response = await getattr(self.stub, conversions["method"])(proto_request)
  File "/home/ec2-user/venv39/lib/python3.9/site-packages/grpc/aio/_call.py", line 290, in __await__
    raise _create_rpc_error(self._cython_call._initial_metadata,
grpc.aio._call.AioRpcError:
```

Error with triton==2.0.0.post1

```text
Traceback (most recent call last):
  File "/home/ec2-user/DeepSpeed-MII/examples/benchmark/txt2img/mii-sd.py", line 27, in <module>
    results = pipe.query(prompts)
  File "/home/ec2-user/venv39/lib/python3.9/site-packages/mii/client.py", line 125, in query
    response = self.asyncio_loop.run_until_complete(
  File "/usr/local/lib/python3.9/asyncio/base_events.py", line 647, in run_until_complete
    return future.result()
  File "/home/ec2-user/venv39/lib/python3.9/site-packages/mii/client.py", line 109, in _query_in_tensor_parallel
    await responses[0]
  File "/home/ec2-user/venv39/lib/python3.9/site-packages/mii/client.py", line 72, in _request_async_response
    proto_response = await getattr(self.stub, conversions["method"])(proto_request)
  File "/home/ec2-user/venv39/lib/python3.9/site-packages/grpc/aio/_call.py", line 290, in __await__
    raise _create_rpc_error(self._cython_call._initial_metadata,
grpc.aio._call.AioRpcError:
```

CrossNox commented 1 year ago

Hi, I'm closing this. I kinda "solved" it by changing the GPU. I was using a T4, but with an A10G it mostly works. I think the issue is that some kernels are not compatible with architectures older than Ampere.
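
The GPU architecture can be checked before running the benchmark. The snippet below is a minimal sketch, assuming PyTorch is installed. `is_ampere_or_newer` and `describe_gpu` are hypothetical helper names, and the threshold (compute capability 8.0 for Ampere) comes from NVIDIA's published tables rather than from DeepSpeed itself. A T4 reports 7.5 and an A10G reports 8.6. Whether the failing kernels actually require Ampere is only the hypothesis stated above, not something this check confirms.

```python
AMPERE_MAJOR = 8  # Ampere GPUs report compute capability 8.x


def is_ampere_or_newer(capability: tuple) -> bool:
    """True if a (major, minor) compute capability is Ampere (8.x) or later."""
    major, _minor = capability
    return major >= AMPERE_MAJOR


def describe_gpu(index: int = 0) -> str:
    """Summarize the CUDA device at `index`, or explain why it cannot be read."""
    try:
        import torch  # imported lazily so the helper above works without torch
    except ImportError:
        return "torch is not installed"
    if not torch.cuda.is_available():
        return "no CUDA device visible"
    name = torch.cuda.get_device_name(index)
    capability = torch.cuda.get_device_capability(index)
    verdict = "Ampere or newer" if is_ampere_or_newer(capability) else "pre-Ampere"
    return f"{name}: compute capability {capability[0]}.{capability[1]} ({verdict})"


# Example usage on the benchmark machine:
#   print(describe_gpu())
```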

mrwyattii commented 1 year ago

@CrossNox you will need to use an older dev release of triton (we haven't finished updating DeepSpeed to use the latest Triton 2.0.0 release): https://pypi.org/project/triton/2.0.0.dev20221202/

And yes I installed from source. You can just do pip install git+https://github.com/microsoft/deepspeed

CrossNox commented 1 year ago

@mrwyattii yes, I tried that as well, but it made no difference until I figured out that the issue was the GPU.