Stable Diffusion example config results in error

ratnopamc commented 9 months ago

I took the Stable Diffusion example configuration and modified it to run on an Inferentia2 instance. My inference script is below:

from io import BytesIO
from fastapi import FastAPI
from fastapi.responses import Response
import torch
import torch_neuronx
import os
import base64

from ray import serve
from optimum.neuron import NeuronStableDiffusionXLPipeline

app = FastAPI()

os.environ["NEURON_RT_NUM_CORES"] = "2"
neuron_cores = 2

@serve.deployment(num_replicas=1, route_prefix="/")
@serve.ingress(app)
class APIIngress:
    def __init__(self, diffusion_model_handle) -> None:
        self.handle = diffusion_model_handle

    @app.get(
        "/imagine",
        responses={200: {"content": {"image/png": {}}}},
        response_class=Response,
    )
    async def generate(self, prompt: str):
        assert len(prompt), "prompt parameter cannot be empty"

        image_ref = await self.handle.generate.remote(prompt)
        image = await image_ref
        file_stream = BytesIO()
        image.save(file_stream, "PNG")
        return Response(content=file_stream.getvalue(), media_type="image/png")

@serve.deployment(
    ray_actor_options={
        "resources": {"neuron_cores": neuron_cores},
        "runtime_env": {"env_vars": {"NEURON_CC_FLAGS": "-O1"}},
    },
    autoscaling_config={"min_replicas": 1, "max_replicas": 2},
)
class StableDiffusionV2:
    def __init__(self):
        from optimum.neuron import NeuronStableDiffusionXLPipeline

        model_dir = "sdxl_neuron/"
        self.pipe = NeuronStableDiffusionXLPipeline.from_pretrained(model_dir, device_ids=[0, 1])

    def generate(self, prompt: str):
        assert len(prompt), "prompt parameter cannot be empty"
        image = self.pipe(prompt).images[0]
        return image

entrypoint = APIIngress.bind(StableDiffusionV2.bind())

When I send a request to the endpoint, e.g. http://127.0.0.1:8000/imagine?prompt={input}, I get the following error:

(ServeReplica:default:APIIngress pid=9170)     response = await func(request)
(ServeReplica:default:APIIngress pid=9170)   File "/home/ec2-user/aws_neuron_venv_pytorch/lib64/python3.8/site-packages/fastapi/routing.py", line 299, in app
(ServeReplica:default:APIIngress pid=9170)     raise e
(ServeReplica:default:APIIngress pid=9170)   File "/home/ec2-user/aws_neuron_venv_pytorch/lib64/python3.8/site-packages/fastapi/routing.py", line 294, in app
(ServeReplica:default:APIIngress pid=9170)     raw_response = await run_endpoint_function(
(ServeReplica:default:APIIngress pid=9170)   File "/home/ec2-user/aws_neuron_venv_pytorch/lib64/python3.8/site-packages/fastapi/routing.py", line 191, in run_endpoint_function
(ServeReplica:default:APIIngress pid=9170)     return await dependant.call(**values)
(ServeReplica:default:APIIngress pid=9170)   File "/home/ec2-user/./inference_sd.py", line 38, in generate
(ServeReplica:default:APIIngress pid=9170)     image = await image_ref
(ServeReplica:default:APIIngress pid=9170) TypeError: object Image can't be used in 'await' expression
(ServeReplica:default:APIIngress pid=9170) INFO 2024-01-16 16:46:30,563 default_APIIngress LcGOVN 7f82dbfd-bc29-40e6-9326-16029bd06b22 /imagine replica.py:772 - __CALL__ ERROR 14203.6ms

ratnopamc commented 9 months ago

I was able to fix this issue by modifying my inference script. Please let me know in a comment if you're open to accepting PRs. I'm happy to create a PR that adds an example of running on AWS Neuron with Ray Serve.

yummydsky commented 8 months ago

I hit the same issue when I use the example from this document:

https://docs.ray.io/en/latest/cluster/kubernetes/examples/stable-diffusion-rayservice.html

Error log:

ERROR 2024-03-12 02:02:07,778 stable_diffusion_APIIngress yjYoZj 5780f284-bcaa-4d0e-93b4-7a7856ab7428 /imagine replica.py:756 - Request failed due to RayTaskError:
Traceback (most recent call last):
  File "/home/ray/anaconda3/lib/python3.8/site-packages/ray/serve/_private/replica.py", line 753, in wrap_user_method_call
    yield
  File "/home/ray/anaconda3/lib/python3.8/site-packages/ray/serve/_private/replica.py", line 914, in call_user_method
    raise e from None
ray.exceptions.RayTaskError: ray::ServeReplica:stable_diffusion:APIIngress() (pid=235, ip=10.244.32.103)
  File "/home/ray/anaconda3/lib/python3.8/site-packages/ray/serve/_private/utils.py", line 165, in wrap_to_ray_error
    raise exception
  File "/home/ray/anaconda3/lib/python3.8/site-packages/ray/serve/_private/replica.py", line 895, in call_user_method
    result = await method_to_call(*request_args, **request_kwargs)
  File "/home/ray/anaconda3/lib/python3.8/site-packages/ray/serve/_private/http_util.py", line 408, in __call__
    await self._asgi_app(
  File "/home/ray/anaconda3/lib/python3.8/site-packages/fastapi/applications.py", line 1115, in __call__
    await super().__call__(scope, receive, send)
  File "/home/ray/anaconda3/lib/python3.8/site-packages/starlette/applications.py", line 122, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/home/ray/anaconda3/lib/python3.8/site-packages/starlette/middleware/errors.py", line 184, in __call__
    raise exc
  File "/home/ray/anaconda3/lib/python3.8/site-packages/starlette/middleware/errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "/home/ray/anaconda3/lib/python3.8/site-packages/starlette/middleware/exceptions.py", line 79, in __call__
    raise exc
  File "/home/ray/anaconda3/lib/python3.8/site-packages/starlette/middleware/exceptions.py", line 68, in __call__
    await self.app(scope, receive, sender)
  File "/home/ray/anaconda3/lib/python3.8/site-packages/fastapi/middleware/asyncexitstack.py", line 20, in __call__
    raise e
  File "/home/ray/anaconda3/lib/python3.8/site-packages/fastapi/middleware/asyncexitstack.py", line 17, in __call__
    await self.app(scope, receive, send)
  File "/home/ray/anaconda3/lib/python3.8/site-packages/starlette/routing.py", line 718, in __call__
    await route.handle(scope, receive, send)
  File "/home/ray/anaconda3/lib/python3.8/site-packages/starlette/routing.py", line 276, in handle
    await self.app(scope, receive, send)
  File "/home/ray/anaconda3/lib/python3.8/site-packages/starlette/routing.py", line 66, in app
    response = await func(request)
  File "/home/ray/anaconda3/lib/python3.8/site-packages/fastapi/routing.py", line 274, in app
    raw_response = await run_endpoint_function(
  File "/home/ray/anaconda3/lib/python3.8/site-packages/fastapi/routing.py", line 191, in run_endpoint_function
    return await dependant.call(**values)
  File "/tmp/ray/session_2024-03-12_01-41-17_429681_8/runtime_resources/working_dir_files/https_github_com_ray-project_serve_config_examples_archive_d6acf9b99ef076a1848f506670e1290a11654ec2/stable_diffusion/stable_diffusion.py", line 27, in generate
    image = await image_ref
TypeError: object Image can't be used in 'await' expression
INFO 2024-03-12 02:02:07,779 stable_diffusion_APIIngress yjYoZj 5780f284-bcaa-4d0e-93b4-7a7856ab7428 /imagine replica.py:772 - __CALL__ ERROR 50383.7ms

sudhanshu456 commented 5 months ago

@ratnopamc could you share how you fixed it?

rfan-debug commented 1 month ago

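I think the example code awaits the result of the handle call twice, which leads to the runtime error you see. The first await already dereferences the reference returned by .remote() (i.e. the future), so image_ref is already the Image and the second await fails.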

        image_ref = await self.handle.generate.remote(prompt)
        image = await image_ref

The current doc has fixed this problem by using a single await:

        image = await self.handle.generate.remote(prompt)
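
To illustrate, here is a minimal sketch that reproduces the same failure mode with plain asyncio, without Ray. FakeImage, FakeResponse, FakeMethod and FakeHandle are hypothetical stand-ins, not Ray Serve classes. The only assumption is that awaiting the object returned by .remote() yields the final result, which is what the traceback above suggests.

```python
import asyncio


class FakeImage:
    """Hypothetical stand-in for the image object returned by the pipeline."""


class FakeResponse:
    """Hypothetical stand-in for the awaitable returned by `.remote()`."""

    def __await__(self):
        async def _resolve():
            await asyncio.sleep(0)  # pretend the model replica is working
            return FakeImage()

        return _resolve().__await__()


class FakeMethod:
    def remote(self, prompt: str) -> FakeResponse:
        return FakeResponse()


class FakeHandle:
    """Hypothetical stand-in for the handle passed to the ingress class."""

    generate = FakeMethod()


async def generate_double_await(handle, prompt):
    # The first await already returns the image itself...
    image_ref = await handle.generate.remote(prompt)
    # ...so this second await raises TypeError (the image is not awaitable).
    return await image_ref


async def generate_single_await(handle, prompt):
    # One await is enough to obtain the final result.
    return await handle.generate.remote(prompt)


def run_demo():
    handle = FakeHandle()
    image = asyncio.run(generate_single_await(handle, "a cat"))
    try:
        asyncio.run(generate_double_await(handle, "a cat"))
        double_await_failed = False
    except TypeError:
        double_await_failed = True
    return image, double_await_failed


if __name__ == "__main__":
    image, double_await_failed = run_demo()
    print(type(image).__name__, double_await_failed)
```

The single-await version returns the image, while the double-await version raises a TypeError. The exact message wording may differ between Python versions.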

ray-project / serve_config_examples

Stable Diffusion example config results in error #6