[Open] lstein opened this issue 3 months ago
@lstein Can you share the image outputs from v0.29.2 and v0.30.0?

By any chance do you have `runwayml/stable-diffusion-v1-5` saved in your HF Cache directory?
> @lstein Can you share the image outputs from v0.29.2 and v0.30.0?

My bad. The regression is present in 0.29.2 as well. The previous working version was 0.27.0. I have amended the bug report.
Here is the output from the script run with diffusers 0.27.0 vs 0.30.0. Also note the difference in image size. 0.27.0 apparently thinks this is an sd-2 model.
[Image outputs: 0.27.0 (left) vs 0.30.0 (right)]
> By any chance do you have `runwayml/stable-diffusion-v1-5` saved in your HF Cache directory?

Indeed yes. I've seen that `from_single_file()` downloads it into the cache if it isn't there already. This seems to be the way it gets the component .json config files for the base model of the checkpoint file being loaded.
Hi @lstein, yes, we updated single file loading to rely on the model cache/configs to set up the pipelines. This enables us to support single file loading for a larger range of models. The `prediction_type` argument is deprecated and will be removed eventually, although we should show a warning here. I will open a PR for it.
I noticed that the scheduler in the repo you linked does contain a config that sets `v_prediction`. You can configure your pipeline in the following way to enable correct inference:
```python
from diffusers import StableDiffusionPipeline
import torch

model_id = "https://huggingface.co/zatochu/EasyFluff/blob/main/EasyFluffV11.safetensors"
pipe = StableDiffusionPipeline.from_single_file(
    model_id,
    config="zatochu/EasyFluff",
    torch_dtype=torch.float16,
).to("cuda")

prompt = "banana sushi"
image = pipe(prompt, num_inference_steps=25).images[0]
image.save("sushi.png")
```
> I noticed that the scheduler in the repo you linked does contain a config that sets `v_prediction`. You can configure your pipeline in the following way to enable correct inference.
I'm a developer of InvokeAI, and am trying to support users who import arbitrary `.safetensors` models, so it will be difficult to find a general mechanism to identify the diffusers model with a config that matches what the safetensors file needs. Can you suggest how to do this?
In most cases we can auto-match to the appropriate config, provided that the `.safetensors` file is in the original format and not the diffusers format. If you compare the keys of a single-file checkpoint with those of a diffusers checkpoint, you will notice that they differ.

In this particular case you're setting the `prediction_type` argument anyway, since the YAML configs do not contain that information either.
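To illustrate the key-layout difference mentioned above, here is a minimal sketch. The prefixes below are illustrative examples of the two formats, not the exact detection logic diffusers uses internally:

```python
# Sketch: tell an original-format (single-file / LDM-style) checkpoint apart
# from a diffusers-format one by its state-dict key prefixes.
# Original checkpoints nest weights under model.diffusion_model.*,
# first_stage_model.* and cond_stage_model.*, while diffusers-format UNets
# use prefixes like down_blocks.* / up_blocks.* instead.
ORIGINAL_PREFIXES = ("model.diffusion_model.", "first_stage_model.", "cond_stage_model.")

def looks_like_original_format(state_dict_keys):
    """Return True if the keys resemble an original single-file checkpoint."""
    return any(key.startswith(ORIGINAL_PREFIXES) for key in state_dict_keys)

# Abbreviated example key sets:
original_keys = ["model.diffusion_model.input_blocks.0.0.weight"]
diffusers_keys = ["down_blocks.0.resnets.0.conv1.weight"]
print(looks_like_original_format(original_keys))   # True
print(looks_like_original_format(diffusers_keys))  # False
```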
You could configure a scheduler beforehand with the prediction type and set it in the pipeline, e.g.:
```python
from diffusers import StableDiffusionPipeline, DDIMScheduler

ckpt_path = "https://huggingface.co/zatochu/EasyFluff/blob/main/EasyFluffV11.safetensors"
pipe = StableDiffusionPipeline.from_single_file(ckpt_path)
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config, prediction_type="v_prediction")
print(pipe.scheduler.config.prediction_type)
```
`from_single_file` operates on the assumption that you are trying to load a checkpoint saved in the original format. We could update/add a util function in `diffusers.loaders.single_file_utils` that raises an error if we can't match to an appropriate config. The current behaviour is to default to SD 1.5, which can be confusing.
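A stricter matching helper along the lines discussed could look like this sketch. The marker keys and config names are hypothetical stand-ins, not the actual patterns used in `single_file_utils`:

```python
# Hypothetical sketch of "raise instead of silently defaulting to SD 1.5".
# Each entry maps a config name to a marker key assumed to identify that
# architecture; the markers here are illustrative only. More specific
# architectures are listed first so they are matched before generic ones.
KNOWN_MARKERS = {
    "sdxl": "conditioner.embedders.1.model.ln_final.weight",
    "sd15": "model.diffusion_model.input_blocks.0.0.weight",
}

def match_config_or_raise(state_dict_keys):
    """Return a config name for the checkpoint, or raise if none matches."""
    keys = set(state_dict_keys)
    for config_name, marker in KNOWN_MARKERS.items():
        if marker in keys:
            return config_name
    raise ValueError(
        "Could not match checkpoint to a known config; "
        "pass the config argument explicitly."
    )
```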
Do you happen to have a list of models for which you would need to support these arbitrary `.safetensors` files? Just so I understand your requirements a bit better.
The YAML file does specify `v_prediction`, though: https://huggingface.co/zatochu/EasyFluff/blob/main/EasyFluffV11.yaml#L5
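For reference, reading the prediction type out of an LDM-style YAML like the one linked can be sketched as follows. This assumes the field lives at `model.params.parameterization`, with `"v"` mapping to diffusers' `v_prediction`; that layout is an assumption about the config, not a diffusers API:

```python
import yaml  # requires PyYAML

def prediction_type_from_original_config(yaml_text):
    """Map an LDM-style config's parameterization field to a prediction_type."""
    cfg = yaml.safe_load(yaml_text)
    param = cfg.get("model", {}).get("params", {}).get("parameterization", "eps")
    return "v_prediction" if param == "v" else "epsilon"

# Abbreviated example in the shape of the linked EasyFluffV11.yaml:
example = """
model:
  params:
    parameterization: v
"""
print(prediction_type_from_original_config(example))  # v_prediction
```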
Should we consider adding a special check for this config when a YAML is passed? I think this is really an edge case where a fine-tuned checkpoint can have a different configuration from the base checkpoint.
Ah, my bad, I missed that. But even in earlier versions, we relied on the `prediction_type` argument to configure the scheduler; it wasn't set from the YAML. In the current version, setting it via `prediction_type` only works if `local_files_only=True`.
The reasoning was to encourage setting the prediction type via the scheduler object and passing that object to the pipeline, like we do for `from_pretrained`. I think I missed this potential path during the refactor, so it is a breaking change. We can add additional checks for legacy kwargs and update the loading, but these kwargs are slated to be removed and this is a bit of an edge case. I would recommend following the same configuration process as `from_pretrained` when doing single file loading: configure the scheduler object beforehand, or use the `config` argument.
@lstein can you let us know if the solution @DN6 proposed here works for you? https://github.com/huggingface/diffusers/issues/9171#issuecomment-2295704043
PR to address the current issue: https://github.com/huggingface/diffusers/pull/9229
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Describe the bug

There are a few Stable Diffusion 1.5 models that use a prediction type of `v_prediction` rather than `epsilon`. In version 0.27.0, `StableDiffusionPipeline.from_single_file()` correctly detected and rendered images from such models. However, in version 0.30.0, these models are always treated as `epsilon`, even when the correct `prediction_type` and `original_config` arguments are set.

Reproduction

You will need to download the original config file, EasyFluffV11.yaml, into the current directory for this to work. After running, the file `sushi.png` will show incorrect rendering.

Logs
Who can help?
@yiyixuxu @asomoza