shimizust opened this issue 9 months ago

Hi, thanks for making this project available!
I was wondering if it is possible to point directly to local models, instead of downloading from a URL or HF Hub?
Hey! Sure, thanks.
Uh, in theory this should be pretty easy, but I've never tried it personally :sweat_smile: Don't forget that if you're deploying your model somewhere else, you'll have to include the model in your Docker build.
So, I think it should be this simple:
- Set RUNTIME_DOWNLOADS=0
- Set MODEL_ID to the directory containing your local model (in diffusers format).
- Set MODEL_PRECISION="fp16" if relevant.

That's assuming you only want one model loaded per run. If you want to be able to switch models at runtime, you can just pass MODEL_ID (pointing to the directory containing the local model) as part of your request. You may need to also set RUNTIME_DOWNLOADS=1, but first try without that.
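For the single-model case, here's an untested sketch of starting the container with those variables via the Python Docker SDK (the image name and mount paths below are placeholders, not anything docker-diffusers-api prescribes):

import docker  # pip install docker

client = docker.from_env()

container = client.containers.run(
    "my-docker-diffusers-api:latest",  # placeholder: your own image build
    detach=True,
    ports={"8000/tcp": 8000},
    volumes={
        # Placeholder host directory holding the diffusers-format model.
        "/host/models/stable-diffusion-v1-4": {"bind": "/models/stable-diffusion-v1-4", "mode": "ro"},
    },
    environment={
        "RUNTIME_DOWNLOADS": "0",
        "MODEL_ID": "/models/stable-diffusion-v1-4",  # path as seen inside the container
        "MODEL_PRECISION": "fp16",
    },
    # GPU/device options omitted here; add device_requests etc. as needed for your runtime.
)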
Hope that helps! Let me know either way.
Thanks for the tips, @gadicc!
I tried this using https://huggingface.co/CompVis/stable-diffusion-v1-4 in a volume mounted to the container. I think it's loading the model fine, but I'm getting an error during inference. Any ideas what I'm doing wrong?
import requests

url = "http://localhost:8000/"  # inference server endpoint

data = {
    "modelInputs": {
        "prompt": "Super dog",
        "num_inference_steps": 5
    },
    "callInputs": {
        "MODEL_ID": "/shared/user/sshimizu/stable-diffusion-v1-4",
        "PIPELINE": "StableDiffusionPipeline",
        "SCHEDULER": "DPMSolverMultistepScheduler",
        "RUNTIME_DOWNLOADS": 0,
        "MODEL_PRECISION": "fp16",
        "safety_checker": "true",
    },
}

response = requests.post(url, json=data)
I get this error:
{"$error":{"code":"PIPELINE_ERROR","name":"TypeError","message":"unsupported operand type(s) for %: 'int' and 'NoneType'","stack":"Traceback (most recent call last):\n File \"\/api\/app.py\", line 638, in inference\n images = (await async_pipeline).images\n File \"\/opt\/conda\/lib\/python3.10\/asyncio\/threads.py\", line 25, in to_thread\n return await loop.run_in_executor(None, func_call)\n File \"\/opt\/conda\/lib\/python3.10\/concurrent\/futures\/thread.py\", line 58, in run\n result = self.fn(*self.args, **self.kwargs)\n File \"\/opt\/conda\/lib\/python3.10\/site-packages\/torch\/utils\/_contextlib.py\", line 115, in decorate_context\n return func(*args, **kwargs)\n File \"\/opt\/conda\/lib\/python3.10\/site-packages\/diffusers\/pipelines\/stable_diffusion\/pipeline_stable_diffusion.py\", line 1062, in __call__\n if callback is not None and i % callback_steps == 0:\nTypeError: unsupported operand type(s) for %: 'int' and 'NoneType'\n"}}
And here are the pod logs:
[2024-02-13 01:29:10 +0000] - (sanic.access)[INFO][127.0.0.1:36822]: POST http://localhost:8000/ 200 975
{
"modelInputs": {
"prompt": "Super dog",
"num_inference_steps": 5
},
"callInputs": {
"MODEL_ID": "/shared/user/sshimizu/stable-diffusion-v1-4",
"PIPELINE": "StableDiffusionPipeline",
"SCHEDULER": "DPMSolverMultistepScheduler",
"RUNTIME_DOWNLOADS": 0,
"MODEL_PRECISION": "FP16",
"safety_checker": "true"
}
}
download_model {'model_url': None, 'model_id': '/shared/user/sshimizu/stable-diffusion-v1-4', 'model_revision': None, 'hf_model_id': None, 'checkpoint_url': None, 'checkpoint_config_url': None}
loadModel {'model_id': '/shared/user/sshimizu/stable-diffusion-v1-4', 'load': False, 'precision': 'FP16', 'revision': None, 'pipeline_class': <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'>}
pipeline <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'>
Downloading model: /shared/user/sshimizu/stable-diffusion-v1-4
Keyword arguments {'use_auth_token': None} are not expected by StableDiffusionPipeline and will be ignored.
Loading pipeline components...: 57%|█████▋ | 4/7 [00:00<00:00, 7.00it/s]`text_config_dict` is provided which will be used to initialize `CLIPTextConfig`. The value `text_config["id2label"]` will be overriden.
Loading pipeline components...: 100%|██████████| 7/7 [00:00<00:00, 7.70it/s]
Downloaded in 913 ms
2024-02-13 06:24:09.407129 {'type': 'loadModel', 'status': 'start', 'container_id': 'inference-server', 'time': 1707805449407, 't': 1511, 'tsl': 0, 'payload': {'startRequestId': None}}
loadModel {'model_id': '/shared/user/sshimizu/stable-diffusion-v1-4', 'load': True, 'precision': 'FP16', 'revision': None, 'pipeline_class': <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'>}
pipeline <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'>
Loading model: /shared/user/sshimizu/stable-diffusion-v1-4
Keyword arguments {'use_auth_token': None} are not expected by StableDiffusionPipeline and will be ignored.
Loading pipeline components...: 43%|████▎ | 3/7 [00:00<00:00, 10.72it/s]`text_config_dict` is provided which will be used to initialize `CLIPTextConfig`. The value `text_config["id2label"]` will be overriden.
Loading pipeline components...: 100%|██████████| 7/7 [00:00<00:00, 8.13it/s]
Loaded from disk in 865 ms, to gpu in 811 ms
2024-02-13 06:24:11.083625 {'type': 'loadModel', 'status': 'done', 'container_id': 'inference-server', 'time': 1707805451084, 't': 3188, 'tsl': 1677, 'payload': {'startRequestId': None}}
Initialized StableDiffusionPipeline for /shared/user/sshimizu/stable-diffusion-v1-4 in 1ms
{'cross_attention_kwargs': {}}
2024-02-13 06:24:11.121568 {'type': 'inference', 'status': 'start', 'container_id': 'inference-server', 'time': 1707805451122, 't': 3226, 'tsl': 0, 'payload': {'startRequestId': None}}
{'callback': <function inference.<locals>.callback at 0x77dc81dfa830>, '**model_inputs': {'prompt': 'Super dog', 'num_inference_steps': 5, 'generator': <torch._C.Generator object at 0x77dc827abf50>}}
20%|██ | 1/5 [00:00<00:00, 12.56it/s]
[2024-02-13 06:24:11 +0000] - (sanic.access)[INFO][127.0.0.1:45132]: POST http://localhost:8000/ 200 975
Hey @shimizust
Looks like a bug... maybe upstream diffusers removed a default; otherwise I'm not sure why we never saw this before.
Let me first explain the [most relevant lines of the] error and then the fix. You don't need to know or understand any of this, so feel free to skip it if it's not of interest.
Line: if callback is not None and i % callback_steps == 0:
Error: unsupported operand type(s) for %: 'int' and 'NoneType'

So it's trying to calculate x % y (the modulo operation, i.e. if we divide x by y, what will the remainder be?). Obviously this requires two numbers (integers), but in this case Python is telling us that the second operand (callback_steps) is not a number, it's a NoneType (i.e. it doesn't exist), and this is why we get the error.
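In plain Python, the failing expression boils down to:

i = 3                  # current denoising step
callback_steps = None  # never given a value
i % callback_steps     # TypeError: unsupported operand type(s) for %: 'int' and 'NoneType'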
As for what leads to this error, that's a bit more complicated. In docker-diffusers-api, we automatically set a callback (to be run every callback_steps steps) if none is provided. This used to work fine, but I guess diffusers now expects callback_steps to be given explicitly whenever callback is.
So, the workaround (until I push a proper fix) is to provide a modelInput called callback_steps with an integer value, e.g.

{
  "modelInputs": {
    // ...
    "callback_steps": 20
  },
  // ...
}
This just controls how often we report back the current progress via webhook... if it's irrelevant for your application just use a number higher than your num_inference_steps.
Two other things I noticed (unrelated):
You included "RUNTIME_DOWNLOADS": 0
but this is something that's only recognized via an environment variable, not as part of the request.
You have safety_checker: "true"
but this should be a boolean
and not a string
, i.e. True
not "true"
. Not really sure how this will affect things but just to avoid any problems further down `:)
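For reference, a sketch of your request with all three adjustments applied (same endpoint and local model path as in your example):

import requests

url = "http://localhost:8000/"  # inference server endpoint

data = {
    "modelInputs": {
        "prompt": "Super dog",
        "num_inference_steps": 5,
        # Workaround: pass an explicit integer so diffusers never sees callback_steps=None.
        "callback_steps": 20,
    },
    "callInputs": {
        "MODEL_ID": "/shared/user/sshimizu/stable-diffusion-v1-4",
        "PIPELINE": "StableDiffusionPipeline",
        "SCHEDULER": "DPMSolverMultistepScheduler",
        "MODEL_PRECISION": "fp16",
        "safety_checker": True,  # boolean, not the string "true"
        # RUNTIME_DOWNLOADS removed: it's read from the environment, not from the request.
    },
}

response = requests.post(url, json=data)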
Good luck!
Thanks, @gadicc! Setting "callback_steps" to an int in "modelInputs" works, and I'm able to generate images from my local model now.
I guess setting RUNTIME_DOWNLOADS isn't strictly necessary then.