Would be nice indeed. I would assume in the end most people are looking for or building something like a fully featured MegaPipeline with LoRAs, Prompt-Emphasis, Textual-Inversion, ControlNet, Blackjack and IP-Adapter.
yup. many pipelines could be methods instead - for example, how come enable_freeu is a method that can be called on a pipeline and not a pipeline of its own? i'd say the primary candidates that don't need to be pipelines are the likes of StableDiffusionSAGPipeline - it's a normal pipeline that exposes one more tunable item.
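to illustrate - from the caller's point of view, the only thing SAG adds over plain text2img is a single extra argument (a sketch; the model id is just an example):

```python
import torch
from diffusers import StableDiffusionSAGPipeline

# identical usage to StableDiffusionPipeline, plus one extra knob
pipe = StableDiffusionSAGPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
image = pipe("a photo of an astronaut", sag_scale=0.75).images[0]
```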
What is the problem with just manually switching pipelines as follows?

```python
from diffusers import StableDiffusionPipeline, StableDiffusionImg2ImgPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe_img2img = StableDiffusionImg2ImgPipeline(**pipe.components)
```
I'm especially curious about why this would cause problems with cpu offload
a) different pipelines may have the same core components but different per-pipeline components - and param validation in the pipeline constructor throws an error - so each pipeline has to be manually constructed.
b) just an example: load a pipeline, enable model offload, set the ip-adapter. now switch the pipeline to img2img and back to txt2img - chances are you end up with a tensor location mismatch in an unrelated part of the code, most commonly in text_encoder (why text_encoder? because i'm building embeds from the prompt, not passing the prompt as-is to the pipeline)
```
lib/python3.11/site-packages/torch/nn/functional.py:2264 in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
(when checking argument for argument index in method wrapper_CUDA__index_select)
```
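for reference, a minimal sketch of the sequence described in (b) - whether it actually reproduces the mismatch depends on the diffusers/accelerate versions in use:

```python
import torch
from diffusers import StableDiffusionPipeline, StableDiffusionImg2ImgPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)
pipe.load_ip_adapter("h94/IP-Adapter", subfolder="models", weight_name="ip-adapter_sd15.bin")
pipe.enable_model_cpu_offload()  # installs accelerate hooks on each component

# manual switch: the new pipeline shares the hooked components but has no
# offload bookkeeping of its own
pipe_i2i = StableDiffusionImg2ImgPipeline(**pipe.components)

# building embeds manually calls text_encoder directly; if offloading left it
# on cpu while the input ids are on cuda, you get the traceback above
ids = pipe.tokenizer("a prompt", return_tensors="pt").input_ids.to("cuda")
embeds = pipe.text_encoder(ids)[0]
```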
Cc: @yiyixuxu
fyi, this is my attempt at a working switch_pipe method: https://github.com/vladmandic/automatic/blob/f04fa75eaee262011817bab4a6e4ffb052c7e22c/modules/sd_models.py#L976
the core is this part:
```python
import inspect

# `cls` is the target pipeline class, `pipeline` is the currently loaded one
signature = inspect.signature(cls.__init__, follow_wrapped=True, eval_str=True)
possible = signature.parameters.keys()
if isinstance(pipeline, cls):
    return pipeline
pipe_dict = {}
components_used = []
components_skipped = []
switch_mode = 'none'
if hasattr(pipeline, '_internal_dict'):
    # copy over every registered component the target class accepts
    for item in pipeline._internal_dict.keys():  # pylint: disable=protected-access
        if item in possible:
            pipe_dict[item] = getattr(pipeline, item, None)
            components_used.append(item)
        else:
            components_skipped.append(item)
new_pipe = cls(**pipe_dict)
```
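the same idea as a self-contained sketch for readers outside that codebase (it assumes, as above, that the source pipeline's _internal_dict keys name its registered components):

```python
import inspect

from diffusers import StableDiffusionImg2ImgPipeline, StableDiffusionPipeline


def switch_pipe(cls, pipeline):
    """Rebuild `pipeline` as `cls`, passing only the components `cls` accepts."""
    if isinstance(pipeline, cls):
        return pipeline
    possible = inspect.signature(cls.__init__).parameters.keys()
    pipe_dict = {
        name: getattr(pipeline, name, None)
        for name in pipeline._internal_dict.keys()
        if name in possible
    }
    return cls(**pipe_dict)


pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe_i2i = switch_pipe(StableDiffusionImg2ImgPipeline, pipe)
```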
yes, I agree with that. for example, on my dedicated server I want to always keep the pipeline loaded, but if I'm working with different models that's not currently possible without loading a few pipelines in parallel, which consumes a lot of resources.
@vladmandic
I found a bug in from_pipe - once that's fixed (in this PR https://github.com/huggingface/diffusers/pull/6820), I'm able to reproduce the use case you described using from_pipe with the script below. Did I miss anything? Would you be able to provide a script that you would like to work but currently fails? I want to understand your use case fully so we can start improving from there :)
```python
from diffusers import AutoPipelineForText2Image, AutoPipelineForImage2Image
import torch
from diffusers.utils import load_image

pipeline = AutoPipelineForText2Image.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16).to("cuda")
pipeline.load_ip_adapter("h94/IP-Adapter", subfolder="models", weight_name="ip-adapter_sd15.bin")
pipeline.set_ip_adapter_scale(0.6)
pipeline.enable_model_cpu_offload()

image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/load_neg_embed.png")

generator = torch.Generator(device="cpu").manual_seed(33)
images = pipeline(
    prompt="best quality, high quality, wearing sunglasses",
    ip_adapter_image=image,
    negative_prompt="monochrome, lowres, bad anatomy, worst quality, low quality",
    num_inference_steps=50,
    num_images_per_prompt=1,
    generator=generator,
).images
images[0].save("out_1.png")

# switch to img2img and back to text2img, reusing the same components
pipeline2 = AutoPipelineForImage2Image.from_pipe(pipeline)
pipeline = AutoPipelineForText2Image.from_pipe(pipeline2)

generator = torch.Generator(device="cpu").manual_seed(33)
images = pipeline(
    prompt="best quality, high quality, wearing sunglasses",
    ip_adapter_image=image,
    negative_prompt="monochrome, lowres, bad anatomy, worst quality, low quality",
    num_inference_steps=50,
    num_images_per_prompt=1,
    generator=generator,
).images
images[0].save("out_2.png")
```
[output comparison images: out_1 | out_2]
the use case is that i load an sd model and want to use different functionality conditionally. examples:

- StableDiffusionSAGPipeline
- StableDiffusionControlNetPipeline

and the same applies to 10+ other features that require 10+ different pipelines - some of them are built-in pipelines and some are community pipelines.
I never want to load a model using a specific pipeline; in 90%+ of cases, i load the model using StableDiffusionPipeline.from_single_file and want to reuse it, only adding components as needed.
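i.e. something like this (a sketch - the checkpoint path is illustrative, and as noted in (a) above, this only works when the component sets line up):

```python
from diffusers import (
    ControlNetModel,
    StableDiffusionControlNetPipeline,
    StableDiffusionPipeline,
)

# load once from a single-file checkpoint...
pipe = StableDiffusionPipeline.from_single_file("models/v1-5-pruned-emaonly.safetensors")

# ...then reuse its components, only adding what the new pipeline needs
controlnet = ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-canny")
pipe_cn = StableDiffusionControlNetPipeline(**pipe.components, controlnet=controlnet)
```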
Thanks, @vladmandic
I think a lot of these use cases can be done using the from_pipe method of AutoPipeline, no? You can switch any SD pipeline to its ControlNet pipeline, with or without an IP-Adapter loaded, and vice versa.... However, we do not support switching between different pipelines; for example, we cannot switch between StableDiffusionSAGPipeline and StableDiffusionPipeline like you mentioned here.
We can probably support this by adding a from_pipe method to DiffusionPipeline. However, I would like to know more examples of the different pipelines you switch between - I imagine most of them are not compatible, no? e.g. you cannot create a Kandinsky pipeline from an SD pipeline etc.
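To make the already-supported case concrete, switching a loaded SD pipeline to its ControlNet variant looks roughly like this (model ids are illustrative):

```python
import torch
from diffusers import AutoPipelineForText2Image, ControlNetModel

pipe = AutoPipelineForText2Image.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)

# passing a controlnet to from_pipe selects the ControlNet variant of the
# text2img pipeline while reusing the already-loaded components
pipe_cn = AutoPipelineForText2Image.from_pipe(pipe, controlnet=controlnet)
```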
the use case is to reuse existing model components whenever possible. yes, that can only ever work if the target pipeline is explicitly compatible with those components (so a scenario such as SD15->Kandinsky should never work). but there are many pipelines that are based on SD15 or SDXL. you've mentioned that currently we cannot switch from StableDiffusionPipeline to StableDiffusionSAGPipeline, and that is just one example. how about SD to PIAPipeline? AnimateDiffPipeline, etc.
@vladmandic
ahh, thanks! I think this makes a lot of sense to me :) I like the idea of extending the from_pipe functionality to any compatible pipeline (i.e., different pipelines that share the same checkpoint and model components). I agree that we should make it super easy to switch and advocate creating pipelines this way.
also, I think with this, we should be able to create separate pipelines for free_init (see https://github.com/huggingface/diffusers/pull/6644). We could use free_init with a simple API like this, and I really don't think it would hurt usage at all:

```python
pipe = AutoFreeInitPipeline.from_pipe(pipe_animatediff)
```
@vladmandic
would you be able to provide an example (with code)? 🥺 that will really help me understand the problem better. I tried and I can't get the same error:

> load a pipeline, enable model offload, set the ip-adapter. now switch the pipeline to img2img and back to txt2img - chances are you end up with a tensor location mismatch in an unrelated part of the code, most commonly in text_encoder (why text_encoder? because i'm building embeds from the prompt, not passing the prompt as-is to the pipeline)
I designed a structure that would remove the need for that and make everything more flexible and completely reproducible. The way it works is that it takes the basic t2i pipeline and divides it into a series of functions that run in a predetermined sequence. To add the slight changes necessary to turn t2i into i2i, you just add a new function and choose where it goes in the sequence. Again, this removes the need to recreate an entirely new pipeline for a small change, as it can easily be plugged in.
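a toy sketch of what I mean (all names hypothetical; nothing here is diffusers API):

```python
from typing import Callable

# a pipeline is just an ordered list of named steps acting on shared state
class StepPipeline:
    def __init__(self, steps: list[tuple[str, Callable[[dict], dict]]]):
        self.steps = steps

    def insert_after(self, name: str, new_name: str, fn: Callable[[dict], dict]):
        idx = next(i for i, (n, _) in enumerate(self.steps) if n == name)
        self.steps.insert(idx + 1, (new_name, fn))

    def __call__(self, state: dict) -> dict:
        for _, fn in self.steps:
            state = fn(state)
        return state

# img2img = txt2img plus one injected step that seeds latents from an image
def add_image_latents(state: dict) -> dict:
    state["latents"] = "latents seeded from the init image"  # placeholder logic
    return state

t2i = StepPipeline([("prepare_latents", lambda s: s), ("denoise", lambda s: s)])
t2i.insert_after("prepare_latents", "add_image_latents", add_image_latents)
```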
@yiyixuxu I think using the auto classes for this is the right approach
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
I think not stale?
yeah, not stale
i'm going to add a from_pipe method on DiffusionPipeline
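so that switching looks something like this (a sketch of the planned API):

```python
from diffusers import StableDiffusionPipeline, StableDiffusionSAGPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

# reuse the loaded components in a different (compatible) pipeline class
pipe_sag = StableDiffusionSAGPipeline.from_pipe(pipe)
```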
completed with #7241
@yiyixuxu this is nice and all, but something that would be really nice is being able to modify pipelines easily without creating new ones. For example: the difference between the regular t2i pipeline and the i2i pipeline is a few injected lines of code. Same thing for controlnet and i2i controlnet. There are many functionalities that can be injected or not, like inpainting, pix2pix and many more. Why not make the pipelines easily injectable without having to recreate a new one entirely?
having separate pipelines in diffusers for features is somewhat cumbersome, especially since pipeline inheritance is less than ideal (e.g. why doesn't StableDiffusionImg2ImgPipeline inherit from StableDiffusionPipeline, which means I cannot check the current model type easily?), AutoPipeline does not have full coverage, and from_pipe even less. IMO, we need a cleaner way to switch pipelines for an already loaded pipeline - right now I'm instantiating the new pipeline manually from the loaded pipeline's components, but that causes issues with model offloading and the like.
this is especially true for community pipelines - I cannot load one from scratch just to run a single generate. I want to switch to a pipeline when I want to use a specific feature and then switch back.
Originally posted by @vladmandic in https://github.com/huggingface/diffusers/issues/6318#issuecomment-1885425063