Razunter opened 5 months ago
I have addressed the Attention mode. However, it cannot support the Latent mode because Forge does not support the AND syntax that serves as the foundation for Latent mode.
@hako-mikan Maybe make a feature request to Forge? I can make one, but without technical details it would be pointless.
@hako-mikan
sorry for suddenly showing up here, but someone sent me a Twitter PM and linked me here.
> because Forge does not support the AND
note that forge does support AND
https://github.com/lllyasviel/stable-diffusion-webui-forge/discussions/169#discussioncomment-8426825
Note that the software that does not support AND is comfyui, and we are not comfyui.
Please let us know if there is anything missing in Forge's UNet Patcher system
Ah, it seems I was mistaken. Indeed, the AND syntax can be used. What I meant to discuss was the batch_cond_uncond option used when processing with the AND syntax. Furthermore, there are issues with the LoRA processing. With Forge's current code, the steps Regional Prompter must take to achieve the user's desired outcome become very complicated.
The Regional Prompter has two modes. Attention mode divides the regions inside the attention layers and applies a separate prompt to each; this happens within the UNet and is achievable as of now. The problem lies in Latent mode, which does its processing after the output leaves the UNet. In Latent mode, the conds created from the AND-separated prompts are each passed through the UNet, and the outputs are then composited region by region. This operation involves several complex processes.
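The region-by-region compositing can be sketched roughly like this (a minimal illustration only; `composite_latent` and the half-image masks are hypothetical, not Regional Prompter's actual code):

```python
import torch

def composite_latent(cond_outputs, masks):
    # cond_outputs: one UNet output per AND-separated prompt, shape (B, C, H, W)
    # masks: one region mask per prompt, shape (1, 1, H, W); the masks are
    # assumed to sum to 1 at every pixel so the regions tile the image
    out = torch.zeros_like(cond_outputs[0])
    for eps, mask in zip(cond_outputs, masks):
        out = out + eps * mask
    return out

# two regions: left half and right half of an 8x8 latent
left = torch.zeros(1, 1, 8, 8)
left[..., :4] = 1.0
right = 1.0 - left
dog = torch.full((1, 4, 8, 8), 1.0)   # stand-in output for the "dog" cond
cat = torch.full((1, 4, 8, 8), 2.0)   # stand-in output for the "cat" cond
mixed = composite_latent([dog, cat], [left, right])
```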
The reason for doing this is to choose, per region, whether a LoRA is applied when drawing multiple characters: for example, apply LoRA1 when denoising the left side and LoRA2 when denoising the right side. When inferring with the prompt dog AND cat and the negative prompt worst quality, two cond computations and one uncond computation are performed. Normally these calculations are done in a single batch, so different LoRAs cannot be applied. The Web-UI has a batch_cond_uncond option that allows all calculations to be done separately; it originally exists to reduce VRAM usage, but here it is used to apply different LoRAs. That is, the calculations are done in the order cond1, cond2, uncond. After this, the denoising calculation is performed; this is handled by the cfg code in the Web-UI.
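The separate-calculation flow described above can be sketched as follows (a toy illustration; `unet` is assumed to be any callable taking `(x, cond)`, and the per-cond weighting of real AND handling is simplified to a plain mean):

```python
import torch

def sample_step(unet, x, conds, uncond, cfg_scale=7.0):
    # With batch_cond_uncond disabled, each cond gets its own forward pass,
    # so a different LoRA could be activated before each call.
    eps_conds = []
    for cond in conds:            # e.g. the "dog" and "cat" conds from AND
        # (here Regional Prompter would switch the active LoRA)
        eps_conds.append(unet(x, cond))
    eps_uncond = unet(x, uncond)
    # classifier-free guidance; real AND handling weights each cond,
    # which is simplified to an average here
    eps_cond = torch.stack(eps_conds).mean(dim=0)
    return eps_uncond + cfg_scale * (eps_cond - eps_uncond)

fake_unet = lambda x, c: x * c        # toy stand-in for the real UNet
x = torch.ones(1, 4, 8, 8)
eps = sample_step(fake_unet, x,
                  conds=[torch.tensor(1.0), torch.tensor(3.0)],
                  uncond=torch.tensor(0.0), cfg_scale=2.0)
```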
Additionally, processing with a batch size greater than two is complicated, as the positions of the conds must be shifted so that the same LoRA is applied to the same region. There are also difficulties in applying LoRA itself. In the current code, LoRA weights are merged into the model beforehand, which makes it difficult to choose later, per calculation, whether a LoRA applies. As mentioned earlier, different LoRAs must be applied for cond1, cond2, and uncond. The Web-UI offers two ways to apply LoRA; as in diffusers, you can choose to add the LoRA's weight at inference time. If the LoRA's weight is added in this manner, you can choose per cond calculation whether it applies. With the current (merged) code, the LoRA weights must be reset and a different LoRA merged at every cond switch, which takes a considerable amount of time.
For these reasons, I have decided to forego the latent mode in Forge. Currently, I'm busy with my main job, so I don't have time to thoroughly address these issues. If you wish to maintain compatibility with the Web-UI, I would appreciate it if you could make the following changes:
Implement an option to perform a separate calculation for each cond. When the batch size increases, it would help if the batch were bundled per cond: if the batch size is three, one call would contain batch items 1, 2, 3 of cond1, and three calls in total would be performed for cond1, cond2, and uncond.
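The bundling could look something like this sketch (the `(batch, n_conds, ...)` layout and `bundle_by_cond` are hypothetical illustrations):

```python
import torch

def bundle_by_cond(conds):
    # conds: tensor of shape (batch, n_conds, ...) where index i along dim 1
    # is the same prompt/region for every image in the batch.
    # Yields one (batch, ...) chunk per cond, so each UNet call processes
    # all batch items for a single cond and a single LoRA can stay active.
    for i in range(conds.shape[1]):
        yield conds[:, i]

# batch of 3 images, two conds each (cond1 and cond2)
embeds = torch.arange(6).reshape(3, 2)
chunks = list(bundle_by_cond(embeds))
```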
Change the application method of LoRA to calculate on the fly, similar to diffusers. For this, it would be beneficial if you could set the strength value of LoRA and change all strengths in a single calculation.
this is the code for each LoRA module:

```python
def forward(self, x, scale=None):
    return (
        self.org_forward(x)
        + self.lora_up(self.lora_down(x)) * self.multiplier * self.scale
    )
```
this is the code for the LoRANetwork:

```python
def set_multiplier(self, num):
    for lora in self.unet_loras + self.te_loras:
        lora.multiplier = num
```
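Under this scheme, switching LoRAs per cond becomes a cheap multiplier change rather than a re-merge of weights. A toy sketch of how it could be used (`ToyLoRANetwork`, `denoise_with_regional_loras`, and the call order are illustrative assumptions, not actual Regional Prompter code):

```python
class ToyLoRANetwork:
    # minimal stand-in mirroring set_multiplier from the snippet above
    def __init__(self):
        self.multiplier = 0.0
    def set_multiplier(self, num):
        self.multiplier = num

def denoise_with_regional_loras(unet, x, cond1, cond2, uncond, net1, net2):
    # activate only the LoRA that belongs to the region being denoised
    net1.set_multiplier(1.0); net2.set_multiplier(0.0)
    eps1 = unet(x, cond1)                 # left region, LoRA1 active
    net1.set_multiplier(0.0); net2.set_multiplier(1.0)
    eps2 = unet(x, cond2)                 # right region, LoRA2 active
    net1.set_multiplier(0.0); net2.set_multiplier(0.0)
    eps_uncond = unet(x, uncond)          # no LoRA for the uncond pass
    return eps1, eps2, eps_uncond

net1, net2 = ToyLoRANetwork(), ToyLoRANetwork()
calls = []
def toy_unet(x, cond):
    # record which multipliers were active for each forward pass
    calls.append((cond, net1.multiplier, net2.multiplier))
    return x

denoise_with_regional_loras(toy_unet, 0, "cond1", "cond2", "uncond", net1, net2)
```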
Additionally, it must be possible to select, for each cond calculation, whether a LoRA is applied in the TextEncoder as well. Moreover, there are various types of LoRA, and accommodating all of them is very challenging.
Due to the significant changes required, it might be better to focus on other elements at this time. The development of Forge is still in its early stages, and there seem to be many other priorities.
Since I'm getting this translated by ChatGPT, there might be parts where the intent doesn't come across clearly. If you have any questions, please feel free to ask.
thanks! i will take a look soon and try to give more hacks to lora compute
Seems like everyone is moving to forge but Lora features are still severely hampered in the current state. Pity
They'll only be there temporarily. Looks like the goal of Forge is to keep making improvements until they run out of ways to make it better than vanilla Automatic1111 AND all of their optimizations are implemented in vanilla automatic. I'm probably messing that up somewhat, but here's more: https://github.com/lllyasviel/stable-diffusion-webui-forge/discussions/166
not really, forge changed so much that not all a1111 extensions work on it, so I still use A1111 (forge extensions don't always work on a1111 either, since they use the comfyUI model patcher)
If I'm gonna break extensions for the sake of performance, I'll just use comfyUI instead: maximum speed and maximum customization
This is my current workaround with Forge.
Currently, Regional Prompter doesn't work with the Forge fork of WebUI; it would be great to support it.
Attention mode: (screenshot)
Latent mode: (screenshot)