Closed xiaoyan1995 closed 5 months ago
I think most LoRAs are compatible, except those that rely on trigger words. We are aware of this issue.
LoRAs are fed CLIP, but there's no re-entry point for CLIP. Do we only include the LoRA through text calls?
I tried taking CLIP from the model loader, feeding it into the LoRA, then a standard CLIP Text (empty) & combining them with the conditioning feed from ELLA before the SAMPLER.
Not sure if that works or does anything. I can see a difference as the LoRA has an influence, but I don't know if the keywords make anything happen.
A node has been reserved for converting CLIP conditioning to ELLA embeds. But for the CLIP conditioning to work, we have to wait for the release of the next version of ELLA, because the current ELLA only accepts T5 embeds.
> LoRAs are fed CLIP, but there's no re-entry point for CLIP. Do we only include the LoRA through text calls?
> I tried taking CLIP from the model loader, feeding it into the LoRA, then a standard CLIP Text (empty) & combining them with the conditioning feed from ELLA before the SAMPLER.
> Not sure if that works or does anything. I can see a difference as the LoRA has an influence, but I don't know if the keywords make anything happen.
It looks like you are correct. The temporary solution for now is to directly concatenate the outputs of CLIP and ELLA.
CLIP -> `B x 77 x 768`; T5 -> `B x N x 2048` -> ELLA -> `B x 64 x 768`. Then we use `concat([B x 64 x 768, B x 77 x 768], dim=1)` as the input of the cross-attention layers in the UNet.
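The concatenation described above can be sketched in PyTorch. This is only an illustration of the shapes from the thread, with random tensors standing in for the real encoder outputs (B is the batch size; variable names are made up for the example):

```python
import torch

B = 2  # example batch size

# Stand-ins for the real encoder outputs described in the thread:
clip_cond = torch.randn(B, 77, 768)  # CLIP text conditioning: B x 77 x 768
ella_cond = torch.randn(B, 64, 768)  # ELLA output from T5 embeds: B x 64 x 768

# Temporary workaround: concatenate along the token dimension (dim=1),
# so the UNet cross-attention attends over both token sequences.
combined = torch.cat([ella_cond, clip_cond], dim=1)

print(combined.shape)  # torch.Size([2, 141, 768])
```

Concatenating along `dim=1` works because both tensors share the same channel width (768), which is what the UNet's cross-attention layers expect.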
@Duemellon We have released a new version and added an example workflow for lora.
Is there any way to connect a LoRA?