pamparamm / sd-perturbed-attention

Perturbed-Attention Guidance and Smoothed Energy Guidance for ComfyUI and SD Forge
MIT License

Perturbed-Attention Guidance and Smoothed Energy Guidance for ComfyUI / SD WebUI (Forge/reForge)

An implementation of "Self-Rectifying Diffusion Sampling with Perturbed-Attention Guidance" (D. Ahn et al.) and "Smoothed Energy Guidance: Guiding Diffusion Models with Reduced Energy Curvature of Attention" (Susung Hong) as an extension for ComfyUI and SD WebUI (Forge/reForge).

Works with SD1.5 and SDXL.

[!NOTE] The paper and demo suggest a CFG scale of 4.0 with a PAG scale of 3.0 applied to the U-Net's middle layer 0, but feel free to experiment.
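As a rough sketch of how the two scales interact: the PAG paper, as commonly implemented, adds a second guidance term to the CFG result that pushes the prediction away from a pass whose self-attention has been perturbed. The snippet below uses plain Python lists in place of latent tensors. The function name and the sample numbers are illustrative assumptions, not this extension's API.

```python
def combine_cfg_and_pag(uncond, cond, perturbed, cfg_scale=4.0, pag_scale=3.0):
    """Combine three noise predictions element-wise.

    uncond    -- prediction without the prompt
    cond      -- prediction with the prompt
    perturbed -- prediction with the prompt and perturbed self-attention
    """
    combined = []
    for u, c, p in zip(uncond, cond, perturbed):
        # Standard classifier-free guidance term.
        cfg_result = u + cfg_scale * (c - u)
        # Extra PAG term: move away from the perturbed prediction.
        combined.append(cfg_result + pag_scale * (c - p))
    return combined


# Toy values standing in for latent tensors (hypothetical numbers).
result = combine_cfg_and_pag([0.0, 1.0], [1.0, 1.0], [0.5, 1.0])
```

With pag_scale set to 0 the extra term vanishes and the result is plain CFG.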

Without adaptive_scale or sigma_start / sigma_end, sampling speed is similar to Self-Attention Guidance (about 0.6x the usual it/s).
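The slowdown comes from the extra model pass that guidance needs on each step. One plausible reading of sigma_start / sigma_end is that they bound the noise levels at which that extra pass runs, so steps outside the range cost the same as ordinary CFG. The sketch below illustrates that reading only. The exact comparison, the "negative means no limit" convention, and the helper name are assumptions, not code taken from this extension.

```python
def pag_is_active(sigma, pag_scale, sigma_start=-1.0, sigma_end=-1.0):
    """Return True when the extra perturbed pass should run at this noise level.

    Assumption: a negative bound means "no limit" on that side.
    """
    if pag_scale == 0:
        return False
    upper = float("inf") if sigma_start < 0 else sigma_start
    lower = 0.0 if sigma_end < 0 else sigma_end
    return lower < sigma <= upper


# Hypothetical sigma schedule going from high noise to low noise.
schedule = [14.6, 7.0, 3.0, 1.0, 0.3]
active_steps = [
    s for s in schedule
    if pag_is_active(s, 3.0, sigma_start=7.0, sigma_end=1.0)
]
```

Under these assumptions only the steps inside the window pay for the additional pass, which is why narrowing the range would recover some sampling speed.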

Installation

ComfyUI

You can either:

SD WebUI (Forge/reForge)

Run git clone https://github.com/pamparamm/sd-perturbed-attention.git inside the stable-diffusion-webui-forge/extensions/ folder.

SD WebUI (Auto1111)

For A1111 WebUI, you can use the PAG implementation from the sd-webui-incantations extension as an alternative.

Guidance Nodes/Scripts

ComfyUI

comfyui-node-pag-basic

comfyui-node-pag-advanced

comfyui-node-seg

SD WebUI (Forge/reForge)

forge-pag

forge-seg

[!NOTE] You can override CFG Scale and PAG Scale/SEG Scale for Hires. fix by opening and enabling the Override for Hires. fix tab. To disable PAG during Hires. fix, set PAG Scale under Override to 0.

Inputs

ComfyUI TensorRT PAG

To use PAG together with ComfyUI_TensorRT, you'll need to:

  1. Have 24GB of VRAM.
  2. Build a static or dynamic TRT engine of the desired model.
  3. Build a second static or dynamic TRT engine of the same model with the same TRT parameters, but with fixed PAG injection in the selected UNET blocks (TensorRT Attach PAG node).
  4. Use the TensorRT Perturbed-Attention Guidance node with two model inputs: one for the base engine and one for the PAG engine.
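The reason for two engines can be sketched as follows: because the PAG injection is fixed at build time, the perturbed prediction has to come from a separate engine rather than from a runtime switch on the base one. In the snippet below both engines are stand-in Python callables, CFG is left out for brevity, and every name is hypothetical. It is not the interface of ComfyUI_TensorRT or of this extension's nodes.

```python
from typing import Callable, List

# Stand-in for a built TRT engine: takes a latent, returns a prediction.
Engine = Callable[[List[float]], List[float]]


def guided_prediction(base_engine: Engine, pag_engine: Engine,
                      latent: List[float], pag_scale: float = 3.0) -> List[float]:
    """Run both engines on the same latent and apply the PAG term."""
    base = base_engine(latent)        # normal attention
    perturbed = pag_engine(latent)    # attention perturbed at build time
    return [b + pag_scale * (b - p) for b, p in zip(base, perturbed)]


# Toy engines (hypothetical): real engines would wrap TensorRT execution.
def toy_base(latent: List[float]) -> List[float]:
    return [2.0 * v for v in latent]


def toy_pag(latent: List[float]) -> List[float]:
    return [1.0 * v for v in latent]


output = guided_prediction(toy_base, toy_pag, [1.0, 2.0])
```

Both engines must accept the same input shapes, which is consistent with step 3 asking for identical TRT parameters.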

trt-engines

trt-inference