JettHu / ComfyUI_TGate

T-GATE implementation for ComfyUI.
GNU General Public License v3.0

ComfyUI_TGate


ComfyUI reference implementation for T-GATE.

T-GATE can bring a 10%-50% speedup for different diffusion models, with only a slight reduction in the quality of the generated images, while maintaining the original composition.

The current implementation relies on some monkey patching. If an error occurs, make sure you are running the latest version.

If my work helps you, consider giving it a star.

Some of my other projects that may help you.

:star2: Changelog

:books: Example workflows

The examples directory contains example workflows. Images generated with and without T-GATE are in the assets folder.

example

| Origin result | T-GATE result |
| --- | --- |
| origin_result | tgate_result |

The T-GATE result image was generated with the workflow included in the example image.

Comparison with AutomaticCFG

AutomaticCFG is another ComfyUI plugin: Your CFG won't be your CFG anymore. It is turned into a way to guide the CFG/final intensity/brightness/saturation, and it adds a 30% speed increase.

Test environment: T4-8G

| | Origin | T-GATE 0.5 | AutomaticCFG | T-GATE 0.35 | AutomaticCFG fastest |
| --- | --- | --- | --- | --- | --- |
| result | origin_result | tgate_result | auto_cfg_boost | tgate_0_35 | auto_cfg_fatest |
| speed | 4.59it/s | 5.68it/s | 5.62it/s | 6.13it/s | 6.13it/s |

T-GATE performs best when the original composition must be preserved. If you do not need to preserve the composition, AutomaticCFG fastest delivers about the same speedup.
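To put the throughput figures in the comparison above into perspective, the following minimal sketch converts each it/s value into a percentage gain over the unmodified run. The numbers are copied from the table; the helper name `relative_speedup` and the dictionary layout are illustrative only and are not part of this plugin.

```python
# Throughput figures (it/s) copied from the comparison table.
speeds = {
    "Origin": 4.59,
    "T-GATE 0.5": 5.68,
    "AutomaticCFG": 5.62,
    "T-GATE 0.35": 6.13,
    "AutomaticCFG fastest": 6.13,
}


def relative_speedup(speed: float, baseline: float) -> float:
    """Return the throughput gain over the baseline, in percent."""
    return (speed / baseline - 1.0) * 100.0


baseline = speeds["Origin"]

# Percentage gain of every configuration except the baseline itself.
speedups = {
    name: relative_speedup(speed, baseline)
    for name, speed in speeds.items()
    if name != "Origin"
}

for name, pct in speedups.items():
    print(f"{name}: +{pct:.1f}%")
```

On this single test setup the gains come out at roughly 24% for T-GATE 0.5, 22% for AutomaticCFG, and 34% for both T-GATE 0.35 and AutomaticCFG fastest.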

:green_book: INSTALL

git clone https://github.com/JettHu/ComfyUI_TGate
# that's all!

:orange_book: Major Features

:book: Nodes reference

TGate Apply

Inputs

Configuration parameters

TGate Apply Advanced

Inputs

Configuration parameters

Optional configuration

TGate Apply (Deprecated)

This node is deprecated and will be removed in a few versions.

Inputs

Configuration parameters

Optional configuration

:rocket: Performance (from T-GATE)

| Model | MACs | Params | Latency | Zero-shot 10K-FID on MS-COCO |
| --- | --- | --- | --- | --- |
| SD-1.5 | 16.938T | 859.520M | 7.032s | 23.927 |
| SD-1.5 w/ TGATE | 9.875T | 815.557M | 4.313s | 20.789 |
| SD-2.1 | 38.041T | 865.785M | 16.121s | 22.609 |
| SD-2.1 w/ TGATE | 22.208T | 815.433M | 9.878s | 19.940 |
| SD-XL | 149.438T | 2.570B | 53.187s | 24.628 |
| SD-XL w/ TGATE | 84.438T | 2.024B | 27.932s | 22.738 |
| Pixart-Alpha | 107.031T | 611.350M | 61.502s | 38.669 |
| Pixart-Alpha w/ TGATE | 65.318T | 462.585M | 37.867s | 35.825 |
| DeepCache (SD-XL) | 57.888T | - | 19.931s | 23.755 |
| DeepCache w/ TGATE | 43.868T | - | 14.666s | 23.999 |
| LCM (SD-XL) | 11.955T | 2.570B | 3.805s | 25.044 |
| LCM w/ TGATE | 11.171T | 2.024B | 3.533s | 25.028 |
| LCM (Pixart-Alpha) | 8.563T | 611.350M | 4.733s | 36.086 |
| LCM w/ TGATE | 7.623T | 462.585M | 4.543s | 37.048 |

Latency was measured on a 1080 Ti commercial card.

The MACs and Params are calculated by calflops.

The FID is calculated by PytorchFID.
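The performance table reports absolute MACs and latency; a short sketch like the one below derives the relative savings for the four base models. The values are copied from the table, while the `summarize` helper and the tuple layout are assumptions made for illustration, not something shipped with T-GATE or this plugin.

```python
# (MACs in T, latency in seconds) for each base model, without and with TGATE,
# copied from the performance table.
rows = {
    "SD-1.5": ((16.938, 7.032), (9.875, 4.313)),
    "SD-2.1": ((38.041, 16.121), (22.208, 9.878)),
    "SD-XL": ((149.438, 53.187), (84.438, 27.932)),
    "Pixart-Alpha": ((107.031, 61.502), (65.318, 37.867)),
}


def summarize(base: tuple, gated: tuple) -> tuple:
    """Return (MACs saved in percent, latency speedup factor)."""
    base_macs, base_latency = base
    gated_macs, gated_latency = gated
    macs_saved = (1.0 - gated_macs / base_macs) * 100.0
    latency_speedup = base_latency / gated_latency
    return macs_saved, latency_speedup


summary = {model: summarize(base, gated) for model, (base, gated) in rows.items()}

for model, (macs_saved, latency_speedup) in summary.items():
    print(f"{model}: {macs_saved:.1f}% fewer MACs, {latency_speedup:.2f}x faster")
```

For these four rows the MACs savings land between roughly 39% and 44%, and the latency speedup between roughly 1.6x and 1.9x.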

:memo: TODO

:mag: Common problems

2024.4.26-29 Updated on 2024.4.29
before_fixed after_fixed