kamalkraj / stable-diffusion-tritonserver

Deploy stable diffusion model with onnx/tensorrt + tritonserver
Apache License 2.0

failed to load 'stable_diffusion' version 1: #2

Closed: whatsondoc closed this issue 1 year ago

whatsondoc commented 1 year ago

Hello,

I'm following the instructions to deploy this project and observing that Triton is unable to load the stable_diffusion model.

This is seen in the Triton Server logs printed to stdout:

1028 08:21:03.012132 581 pb_stub.cc:309] Failed to initialize Python stub: AttributeError: 'LMSDiscreteScheduler' object has no attribute 'set_format'

At:
  /models/stable_diffusion/1/model.py(58): initialize

I1028 08:21:03.465850 1 onnxruntime.cc:2606] TRITONBACKEND_ModelInstanceInitialize: encoder (GPU device 1)
E1028 08:21:03.470367 1 model_lifecycle.cc:596] failed to load 'stable_diffusion' version 1: Internal: AttributeError: 'LMSDiscreteScheduler' object has no attribute 'set_format'

At:
  /models/stable_diffusion/1/model.py(58): initialize

The specific function referenced in model.py is here (line 58, indicated below):

    def initialize(self, args: Dict[str, str]) -> None:
        """
        Initialize the tokenization process
        :param args: arguments from Triton config file
        """
        current_name: str = str(Path(args["model_repository"]).parent.absolute())
        self.device = "cpu" if args["model_instance_kind"] == "CPU" else "cuda"
        self.tokenizer = CLIPTokenizer.from_pretrained(current_name + "/stable_diffusion/1/")
        self.scheduler = LMSDiscreteScheduler(beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear")
        self.scheduler = self.scheduler.set_format("pt")   <--
        self.height = 512
        self.width = 512
        self.num_inference_steps = 50
        self.guidance_scale = 7.5
        self.eta = 0.0

I tried commenting this line out so that self.scheduler is only defined on the previous line, and Triton Server starts; all models (including stable_diffusion) load successfully and are reported by Triton as online and ready.
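
For context, newer diffusers releases removed the set_format method from the schedulers (they return PyTorch tensors by default), so a rough sketch of the scheduler setup without that call could look like the snippet below. This is only my guess at a workaround, not the repository's actual patch:

    # Sketch only: assumes a diffusers release where LMSDiscreteScheduler
    # no longer exposes set_format() and already works with torch tensors.
    from diffusers import LMSDiscreteScheduler

    scheduler = LMSDiscreteScheduler(
        beta_start=0.00085,
        beta_end=0.012,
        beta_schedule="scaled_linear",
    )
    # Instead of scheduler.set_format("pt"), the timesteps are set before
    # the denoising loop:
    scheduler.set_timesteps(50)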

Leaving this change in place and subsequently working through the Jupyter notebook, an error is raised (somewhat expectedly):

InferenceServerException: Failed to process the request(s) for model instance 'stable_diffusion', message: Stub process is not healthy.

So I'm forced back to the original issue: have you seen this before, or do you have any idea of a fix?

lolagiscard commented 1 year ago

Same issue on my side. Testing on a V100 with pip install --upgrade diffusers (0.7.2), I get: "failed to load 'stable_diffusion' version 1: Internal: AttributeError: 'LMSDiscreteScheduler' object has no attribute 'set_format'". Which version of the diffusers library are you using for this demo?

EDIT: with a previous version of diffusers (0.3.0) it loads the model without error. But then I still hit the other error: E1107 15:43:31.899870 116 python_be.cc:1818] Stub process is unhealthy and it will be restarted.
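
For anyone else hitting this, a quick way to check which behaviour your installed diffusers gives you (a hypothetical snippet of mine, not part of the repo) is:

    import diffusers
    from diffusers import LMSDiscreteScheduler

    print(diffusers.__version__)  # e.g. 0.3.0 (loads fine) vs 0.7.2 (fails here)
    scheduler = LMSDiscreteScheduler(
        beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear"
    )
    # True on older releases; newer releases removed the method.
    print(hasattr(scheduler, "set_format"))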

Where do you think this comes from? Thanks for your help.

kamalkraj commented 1 year ago

Try now

https://github.com/kamalkraj/stable-diffusion-tritonserver/commit/aa52e4c6dba6179be5c3d421e2873a12db0ebef7

lolagiscard commented 1 year ago

Unfortunately it's still not working when launching an inference. The server startup is, however, fixed with this new diffusers version. The log shows:

E1108 12:53:16.125181 95 python_be.cc:1818] Stub process is unhealthy and it will be restarted.
ftfy or spacy is not installed using BERT BasicTokenizer instead of ftfy.

What hardware are you testing on? Thanks for the support.

whatsondoc commented 1 year ago

I'm seeing the same, I'm afraid.

The Triton container build and server launch run smoothly; however, when working through the Inference notebook I get to stage #7, which produces the InferenceServerException: Failed to process the request(s) for model stable_diffusion, message: stub process is not healthy (which also appears in the Triton Server log).

For reference, I'm trying this on a DGX-2 with 16 x V100-SXM3-32GB.

kamalkraj commented 1 year ago

Hardware tested: 1080Ti

kamalkraj commented 1 year ago

@whatsondoc @lolagiscard Could you please share screenshots/logs?

lolagiscard commented 1 year ago

Server part (before inference): [screenshot attached]

Inference part: [screenshot attached]

Server after inference: [screenshot attached]

whatsondoc commented 1 year ago

Sure, here are a few screenshots (let me know if you'd like the full logs; it would take a bit to get them out of the environment, but it's doable).

The TritonServer logs were screenshotted after the Inference call was made.

[screenshots attached]

kamalkraj commented 1 year ago

Try running the Docker container with the command below:

docker run -it --rm --gpus device=0 -p8000:8000 -p8001:8001 -p8002:8002 --shm-size 16384m   \
-v $PWD/stable-diffusion-v1-4-onnx/models:/models tritonserver \
tritonserver --model-repository /models/
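
(For what it's worth, --gpus device=0 exposes only the first GPU to the container, so the Python backend stub and the model instances should all land on the same device.)
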
lolagiscard commented 1 year ago

I'm already testing on 1 GPU only, if this is the change you want us to try. There might be something wrong with the NVIDIA Triton Docker image itself; it might not work with some GPU architectures.

kamalkraj commented 1 year ago

Okay. I will test it on v100 and let you know.

@lolagiscard are you running on a V100?

lolagiscard commented 1 year ago

Great, thanks, let us know. Yes, I'm also on a V100; sorry, I should have said that above already :)

kamalkraj commented 1 year ago

I was able to reproduce the issue with torch version 1.13

I have pinned torch 1.12.1 in the Docker image, which fixes the issue: https://github.com/kamalkraj/stable-diffusion-tritonserver/commit/666e1486780b050e18ab56b445d2310e54773b0d
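
If you rebuild the image and want to confirm the pin took effect inside the container, a trivial check (my own snippet, not from the repo) is:

    import torch

    # After rebuilding with the pinned requirement this should print 1.12.1.
    print(torch.__version__)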

lolagiscard commented 1 year ago

All good now, indeed! Thanks a lot :)

whatsondoc commented 1 year ago

Nice - thanks kamalkraj! Works like a charm.

One observation is that I needed to reduce this to run on a single GPU; when using more than one (originally I tried with 4), I get the following (screenshot attached).

As mentioned though, with a single GPU it works great. Appreciate the support.

bild

kamalkraj commented 1 year ago

I will check the multi-gpu issue.

Please check out v2.

Let me know of any issues.

whozwhat commented 1 year ago

I also encountered the same issue when using multiple GPUs after checking out the v3 branch; hope this screenshot helps in some way.

[screenshot attached: debug]

kamalkraj commented 1 year ago

@whozwhat Multi-GPU is not fixed yet; running Docker with the command below should work around the issue for now:

docker run -it --rm --gpus device=0 -p8000:8000 -p8001:8001 -p8002:8002 --shm-size 16384m   \
-v $PWD/models:/models sd_trt bash

whozwhat commented 1 year ago

Thanks for the reply, this cmd works. It would be awesome if the multi-GPU issue could be fixed.