lllyasviel / Fooocus

Focus on prompting and generating
GNU General Public License v3.0

TensorRT support #897

Open northfoxz opened 11 months ago

northfoxz commented 11 months ago

Is your feature request related to a problem? Please describe.
SDXL inference is slow; we can speed up the process using TensorRT.

Describe the idea you'd like
Please support TensorRT; it can reduce generation time by at least 2x.

reference: Stable Diffusion WebUI TensorRT https://github.com/NVIDIA/Stable-Diffusion-WebUI-TensorRT

Thank you
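For context (not Fooocus-specific), the usual TensorRT path for diffusion models is to export the UNet to ONNX and then build an engine from it. A minimal sketch, assuming a stock diffusers UNet with SD 1.5-style inputs; the model ID, shapes, and file names are illustrative:

```python
import torch
from diffusers import UNet2DConditionModel

class UNetWrapper(torch.nn.Module):
    """Unwrap the diffusers output dataclass so ONNX export sees a plain tensor."""
    def __init__(self, unet):
        super().__init__()
        self.unet = unet

    def forward(self, sample, timestep, encoder_hidden_states):
        return self.unet(sample, timestep, encoder_hidden_states, return_dict=False)[0]

unet = UNet2DConditionModel.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="unet", torch_dtype=torch.float16
).to("cuda").eval()

# Dummy inputs matching the UNet forward signature (batch 2 for CFG, 64x64 latents).
sample = torch.randn(2, 4, 64, 64, dtype=torch.float16, device="cuda")
timestep = torch.tensor(999, device="cuda")
hidden_states = torch.randn(2, 77, 768, dtype=torch.float16, device="cuda")

torch.onnx.export(
    UNetWrapper(unet), (sample, timestep, hidden_states), "unet.onnx",
    input_names=["sample", "timestep", "encoder_hidden_states"],
    output_names=["out_sample"],
    opset_version=17,
)
```

The resulting unet.onnx can then be compiled into an engine, e.g. with `trtexec --onnx=unet.onnx --saveEngine=unet.plan --fp16`. SDXL adds extra conditioning inputs (text_embeds, time_ids), which is part of why its engines are more involved to build.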

kubilaykilinc commented 11 months ago

I used TensorRT with Automatic1111, but it had very extreme contrast issues, so I deleted it. I do not know why it has this issue.

chengzeyi commented 11 months ago

@northfoxz Hi, friend! I know you are suffering great pain from using TRT with diffusers.

So why not choose my totally open-sourced alternative: stable-fast? It's on par with TRT in inference speed, faster than torch.compile and AITemplate, and is super dynamic and flexible, supporting ALL SD models, LoRA, and ControlNet out of the box!
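For reference, the quick-start pattern from the stable-fast README looks roughly like this; treat it as a sketch, since the exact import path has moved between stable-fast versions and the model ID and config flags shown are just examples:

```python
import torch
from diffusers import StableDiffusionXLPipeline
from sfast.compilers.diffusion_pipeline_compiler import compile, CompileConfig

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

config = CompileConfig.Default()
config.enable_xformers = True    # needs xformers installed
config.enable_triton = True      # needs triton installed
config.enable_cuda_graph = True  # biggest win with fixed input shapes

pipe = compile(pipe, config)  # compilation/warmup happens on the first calls

image = pipe("an astronaut riding a horse", num_inference_steps=30).images[0]
image.save("out.png")
```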

aquawaves commented 4 months ago

Long time no replies here, but an interesting finding: the face-swap software Rope (specifically Alucard's fork) supports TensorRT by setting it as an ONNX Runtime execution provider ahead of CUDA:

```python
self.providers = [
    (
        "TensorrtExecutionProvider",
        {
            "trt_engine_cache_enable": True,
            "trt_engine_cache_path": "tensorrt-engines",
            "trt_timing_cache_enable": True,
            "trt_timing_cache_path": "tensorrt-engines",
            "trt_dump_ep_context_model": True,
            "trt_ep_context_file_path": "tensorrt-engines",
        },
    ),
    "CUDAExecutionProvider",
]
```

and this alone allows loading a lot of models into VRAM without decreasing processing speed once the TRT engines are built. Just a hint of some kind, in case it's useful.
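For anyone who wants to try the same pattern, that list is just the providers argument of an ONNX Runtime session. A minimal sketch, with a trimmed-down options dict and an illustrative model path:

```python
import onnxruntime as ort

providers = [
    (
        "TensorrtExecutionProvider",
        {
            "trt_engine_cache_enable": True,  # reuse built engines across runs
            "trt_engine_cache_path": "tensorrt-engines",
        },
    ),
    "CUDAExecutionProvider",  # fallback for ops TensorRT cannot handle
]

# The first run is slow while TensorRT builds engines; later runs load them from cache.
session = ort.InferenceSession("model.onnx", providers=providers)
```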

zhangyuzhen1990 commented 2 months ago

We really need TensorRT support in Fooocus.