Open G-370 opened 3 weeks ago
Isn't it some kind of "IPAdapter"-style option?
Somewhat related: targeting specific blocks of the UNet (i.e. out0 and out1) is demonstrated in https://youtu.be/0ChoeLHZ48M . It is not clear to me why training is needed for a single-image dataset.
Are there any trained example B-LoRA model files available for testing inference? I assumed the author would include a bunch of them since they are so easy to train and so tiny, but no, I couldn't find any.
=== [update] One training run completed after 15 minutes, resulting in this:
-rw-r--r-- 1 liusida liusida 55M Jun 15 19:32 optimizer.bin
-rw-r--r-- 1 liusida liusida 108M Jun 15 19:32 pytorch_lora_weights.safetensors
-rw-r--r-- 1 liusida liusida 14K Jun 15 19:32 random_states_0.pkl
-rw-r--r-- 1 liusida liusida 988 Jun 15 19:32 scaler.pt
-rw-r--r-- 1 liusida liusida 1000 Jun 15 19:32 scheduler.bin
It works. Interesting. I think I can understand the inference part, so I'll make a custom node that can load B-LoRA models and choose to adopt their Style or Content.
Work in progress: https://github.com/liusida/ComfyUI-B-LoRA
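The core of a loader like that is just partitioning the LoRA state dict by which UNet block each key targets. A minimal sketch, assuming the diffusers naming convention where (per the B-LoRA paper) content lives in `up_blocks.0.attentions.0` and style in `up_blocks.0.attentions.1`; the function name and the example keys below are my own, not necessarily what ComfyUI-B-LoRA uses:

```python
# Assumed block-name substrings; content/style assignment follows the
# B-LoRA paper's choice of two specific SDXL UNet attention blocks.
CONTENT_BLOCK = "up_blocks.0.attentions.0"
STYLE_BLOCK = "up_blocks.0.attentions.1"

def split_blora(state_dict):
    """Partition B-LoRA weights into content-only and style-only dicts."""
    content, style = {}, {}
    for key, tensor in state_dict.items():
        if CONTENT_BLOCK in key:
            content[key] = tensor
        elif STYLE_BLOCK in key:
            style[key] = tensor
        # anything else is ignored: B-LoRA only trains these two blocks
    return content, style

# Hypothetical keys, shaped like diffusers LoRA key names:
fake = {
    "unet.up_blocks.0.attentions.0.transformer_blocks.0.attn1.to_q.lora_A.weight": 1,
    "unet.up_blocks.0.attentions.1.transformer_blocks.0.attn1.to_q.lora_A.weight": 2,
}
content, style = split_blora(fake)
print(len(content), len(style))  # one key lands in each bucket
```

A node could then apply only `content` or only `style` to the UNet, which is what makes the style/content swap possible at inference time without retraining.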
========= Here is my first attempt at training two B-LoRAs and generating a result using both of them.
A B-LoRA for style:
Another B-LoRA for content:
The result:
OK, done. A lightweight custom node for loading B-LoRA models:
How do you train it?
https://b-lora.github.io/B-LoRA/
I think B-LoRA is an extremely interesting concept; it's not just another one of a thousand ways to train any kind of LoRA...
Nope, it's specifically a LoRA that targets only blocks OUT0 and OUT1 of the SDXL UNet, and from very brief training on A SINGLE IMAGE it learns a decoupled "style × content" representation of that image. That means we can make absurdly tiny LoRAs out of single images and then apply the image's content IN OTHER styles. I don't get why the community hasn't already caught a whiff of this.
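The size argument is easy to sanity-check with back-of-envelope arithmetic: a rank-r LoRA adds r·(d_in + d_out) parameters per adapted weight matrix, so adapting only two blocks instead of every block shrinks the file proportionally. A toy sketch; the layer shapes and block counts below are made up for illustration, not SDXL's real ones:

```python
def lora_params(shapes, rank):
    """Extra parameters a rank-`rank` LoRA adds for the given weight shapes."""
    return sum(rank * (d_in + d_out) for d_out, d_in in shapes)

# Hypothetical attention projections (out_features, in_features) per block:
one_block = [(1280, 1280)] * 4          # q, k, v, out projections, toy size
blora = lora_params(one_block * 2, rank=64)   # only the two targeted blocks
full = lora_params(one_block * 20, rank=64)   # same rank on 20 blocks
print(blora, full)  # the two-block LoRA is 10x smaller in this toy setup
```

The ratio is just (blocks adapted) × (rank), which is why a low-rank, two-block B-LoRA ends up orders of magnitude smaller than a full-UNet character LoRA.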
If this gets widely adopted, users of Pony and of SDXL in general won't need massive high-rank LoRAs that apply to every single UNet block just to get the character (content) or style they want.