liusida / ComfyUI-B-LoRA

A ComfyUI custom node that loads and applies B-LoRA models.
MIT License

I can't explain why: with the same model, the same base model, and identical prompts, the style transfer I get from this custom node is very far from the HF online demo. Why is that? #8

Open kenic123 opened 3 months ago

kenic123 commented 3 months ago

I can't explain why: with the same model, the same base model, and exactly the same prompt, the degree of style transfer I get from this custom node is very far from what the HF online demo produces. Why is that? Also, long prompts seem to weaken the LoRA's effect. [screenshots attached]

kenic123 commented 3 months ago

Very much looking forward to your reply

kenic123 commented 3 months ago

I'm very optimistic about this project

kenic123 commented 3 months ago

I ran it on Hugging Face with the prompt: "A boy is playing football. A [v30]". Here is the link: https://huggingface.co/lora-library/B-LoRA-pen_sketch?text=A+boy+is+playing+football+A+%5Bv30%5D The result: [image] The result I got locally with this node: [image] And as the prompt gets longer, the LoRA's effect gets worse.

kenic123 commented 3 months ago

[screenshot] I tried many prompt lengths, and it's still the same.

liusida commented 3 months ago

I've tried your example, and lowered the CFG parameter to 2.0:

[screenshot: result with CFG lowered to 2.0]
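For anyone comparing against the HF online demo (which runs diffusers): the CFG value in ComfyUI's sampler corresponds to the guidance_scale argument of a diffusers pipeline call. A minimal sketch, assuming pipeline is an SDXL pipeline with the B-LoRA already loaded (the names here are illustrative, and the prompt and seed are just the ones from this thread):

import torch

generator = torch.Generator(device='cuda').manual_seed(2)
image = pipeline(
    prompt='A boy is playing football. A [v30]',
    guidance_scale=2.0,  # the diffusers equivalent of lowering CFG to 2.0
    num_inference_steps=20,
    generator=generator,
).images[0]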

liusida commented 3 months ago

[workflow image attached]

maepopi commented 2 months ago

Hey there! I have the same problem. I can't seem to get the B-LoRAs to work in ComfyUI, while they work fine in a notebook where I use the native diffusers API with the B-LoRA code integrated into it. Here are some images.

First, the trained style: [image: style_vangogh_008]

The result I get in my notebook: [image: cat_vangogh_notebook]

The result I get in ComfyUI (as you can see, it is very far from the desired style, and the seed is the same): [image: cat_vangogh_comfy]

Here's my setup in ComfyUI: [screenshot]

Could this simply be a matter of the inference parameters? Or could it come from the conversion you're doing inside the code, from diffusers format to kohya format? Lowering the CFG doesn't help much in this case.
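For concreteness, here is a rough sketch of the kind of key renaming I mean by that conversion; I haven't read your conversion code in detail, so treat this as illustrative only (the exact key layout varies across diffusers versions, and your node may handle more cases, such as alpha keys or the text encoder):

def diffusers_key_to_kohya(key: str) -> str:
    # Illustrative only. Example:
    #   'unet.up_blocks.0.attentions.0.transformer_blocks.0.attn1.to_q.lora.down.weight'
    # becomes
    #   'lora_unet_up_blocks_0_attentions_0_transformer_blocks_0_attn1_to_q.lora_down.weight'
    key = key.replace('.lora.down.weight', '.lora_down.weight')
    key = key.replace('.lora.up.weight', '.lora_up.weight')
    module, _, tail = key.partition('.lora_')
    return 'lora_' + module.replace('.', '_') + '.lora_' + tail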

Thank you!

liusida commented 2 months ago

> Hey there! I have the same problem. I can't seem to get the B-LoRAs to work in ComfyUI, while they work fine in a notebook where I use the native diffusers API with the B-LoRA code integrated into it. Here are some images.

WOW! That's very different!

If possible, can you please help me set up the same environment, so that I can debug the two methods side by side and find the source of the difference?

Can you share your B-LoRA model, the code you run in the notebook, and the PNG with the workflow embedded (or maybe simply a JSON file)? You can attach them here.

Thank you for your time!

maepopi commented 2 months ago

Hello!

You know, after some investigation I realized that the problem might be even deeper. Comparing a basic diffusers workflow to a basic ComfyUI one, the results are completely different. So I think it's going to be hard for you to debug this! In the meantime, I've built upon a custom node that adapts diffusers pipelines to ComfyUI. I'll implement B-LoRAs there and see if it works better.

But if you still want to check whether there's something wrong with your node, here is the info you asked for:

The code of the notebook:

import torch
from diffusers import AutoPipelineForText2Image
from blora_utils import BLOCKS, filter_lora, scale_lora  # helpers from the original B-LoRA repo

def load_b_lora_to_unet(pipe,
                        content_lora_model_id: str = '',
                        style_lora_model_id: str = '',
                        content_alpha: float = 1.,
                        style_alpha: float = 1.) -> None:
    try:
        # Get the content B-LoRA state dict, keep only the content blocks, and scale it
        if content_lora_model_id:
            content_B_LoRA_sd, _ = pipe.lora_state_dict(content_lora_model_id)
            content_B_LoRA = filter_lora(content_B_LoRA_sd, BLOCKS['content'])
            content_B_LoRA = scale_lora(content_B_LoRA, content_alpha)
        else:
            content_B_LoRA = {}

        # Same for the style B-LoRA
        if style_lora_model_id:
            style_B_LoRA_sd, _ = pipe.lora_state_dict(style_lora_model_id)
            style_B_LoRA = filter_lora(style_B_LoRA_sd, BLOCKS['style'])
            style_B_LoRA = scale_lora(style_B_LoRA, style_alpha)
        else:
            style_B_LoRA = {}

        # Merge the two state dicts and load them into the UNet
        res_lora = {**content_B_LoRA, **style_B_LoRA}
        pipe.load_lora_into_unet(res_lora, None, pipe.unet)
    except Exception as e:
        raise type(e)(f'failed to load_b_lora_to_unet, due to: {e}')

style_LoRA_path = "path/to/safetensors"
content_LoRA_path = ""
content_alpha, style_alpha = 1, 1.1

pipeline = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, variant="fp16", use_safetensors=True
).to("cuda")

load_b_lora_to_unet(pipeline, content_LoRA_path, style_LoRA_path, content_alpha, style_alpha)

prompt = 'A cat in the style of [v03]'
generator = torch.Generator(device='cuda').manual_seed(2)
image = pipeline(prompt=prompt, num_inference_steps=20, generator=generator).images[0].resize((512, 512))

image  # display the result in the notebook
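In case it helps: BLOCKS, filter_lora, and scale_lora come from the blora_utils module of the original B-LoRA repo. Roughly, they look like the sketch below; this is an approximation, not a verbatim copy. The idea is that in SDXL's UNet, up_blocks.0.attentions.0 mostly encodes content and up_blocks.0.attentions.1 mostly encodes style, so each B-LoRA is filtered down to just one of those blocks.

# Rough sketch of the blora_utils helpers (approximate, not verbatim)
BLOCKS = {
    'content': ['unet.up_blocks.0.attentions.0'],
    'style': ['unet.up_blocks.0.attentions.1'],
}

def filter_lora(state_dict, blocks):
    # Keep only the LoRA weights whose keys belong to the given UNet blocks
    return {k: v for k, v in state_dict.items() if any(b in k for b in blocks)}

def scale_lora(state_dict, alpha):
    # Scale all LoRA weights by alpha
    return {k: v * alpha for k, v in state_dict.items()}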

Now the ComfyUI setup: workflow.json (attached)

And the B-LoRA: https://we.tl/t-CNgvcWPisk

Hope it helps, let me know what you can find!!

liusida commented 2 months ago

Thank you @maepopi. Good to know!

I think I once heard Comfy, the author of ComfyUI, comment that the diffusers implementation is not that good. I guess Comfy has changed some of the mechanisms. Ha-ha.

I would like to dig in and see how ComfyUI and diffusers differ as well.

maepopi commented 2 months ago

Hey @liusida! Yeah, I read that as well, that they thought the diffusers library was a mess. From what I observed in their code, they picked every brick apart and rebuilt them in a different way! I've tried implementing B-LoRAs in my custom diffusers-to-comfy node, but for now I have a few issues as well; don't hesitate to have a look: https://github.com/maepopi/Diffusers-in-ComfyUI. I've implemented pretty much the same code as in my notebook.

liusida commented 2 months ago

> Hey @liusida! Yeah, I read that as well, that they thought the diffusers library was a mess. From what I observed in their code, they picked every brick apart and rebuilt them in a different way! I've tried implementing B-LoRAs in my custom diffusers-to-comfy node, but for now I have a few issues as well; don't hesitate to have a look: https://github.com/maepopi/Diffusers-in-ComfyUI. I've implemented pretty much the same code as in my notebook.

I really like different approaches to the same problem, so I'll definitely give a thumbs up to your effort!

Meanwhile, I am trying to learn from ComfyUI and build another system so that the browser doesn't submit the entire graph to the server. Instead, each node submits to the server independently, and the workflow is controlled in the browser. I call it StoneSoup. For now, I am using nodes from Comfy to do the heavy lifting, but I am looking forward to integrating other node packs that use different approaches, such as diffusers.

The goal of StoneSoup is to facilitate AI experimentation, such as sweeping through hyperparameters and so on. Please check it out if you are interested.

liusida commented 2 months ago

P.S. StoneSoup is still a work in progress, so expect dirty implementations and bugs. Ha-ha!

maepopi commented 2 months ago

Oh wow, thank you very much! I'm not very well versed in server management actually, I'm more of a local software engineer, haha, but I love to learn! I'll check it out :) Let's keep in touch about our respective investigations :) On my side, for now, I'm struggling to reproduce the same results as my notebook even with my diffusers-to-comfy pipeline. But this time I've made sure that a generic diffusion pipeline outputs the same thing in my notebook and inside ComfyUI, so that's a start!

maepopi commented 2 months ago

Hey again, I've finally succeeded in reproducing the same result as my notebook inside ComfyUI, using the diffusers pipeline and the original B-LoRA code. I've pushed the changes to my repo; feel free to take a look if you like. At this stage I'm pretty certain that the differences in the base building blocks between ComfyUI and diffusers are largely to blame for the differences in your plugin! I've also noticed something in the inference code, when you write: image = pipeline(arguments)

If you have negative_prompt in your arguments, even if the negative prompt is an empty string, it still has a massive impact on the image. Good to know!
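To illustrate, a minimal sketch (the prompt and seed are just the ones from my notebook, and the comments reflect what I observed in my tests, not documented diffusers behavior):

generator = torch.Generator(device='cuda').manual_seed(2)
image_a = pipeline(prompt='A cat in the style of [v03]',
                   num_inference_steps=20, generator=generator).images[0]

# Re-seed so the only difference between the two calls is the extra argument
generator = torch.Generator(device='cuda').manual_seed(2)
# Even an *empty* negative_prompt changed the output drastically in my tests,
# breaking the B-LoRA effect
image_b = pipeline(prompt='A cat in the style of [v03]', negative_prompt='',
                   num_inference_steps=20, generator=generator).images[0]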

Anyway, I'll keep an eye on this thread to see if you find a way to mitigate the differences in your plugin!

liusida commented 2 months ago

> I've also noticed something in the inference code, when you write: image = pipeline(arguments)
>
> If you have negative_prompt in your arguments, even if the negative prompt is an empty string, it still has a massive impact on the image. Good to know!

Do you mean in the diffusers inference code? Is it a good impact or a negative impact?

maepopi commented 2 months ago

It's still unclear to me, because I haven't tested it thoroughly with the ComfyUI API; in this case I'm talking about diffusers. And in this case, it completely breaks the B-LoRA effect.