AIFSH / OmniGen-ComfyUI


Running in Cloud - Ultra - Stand Alone app 30 Seconds - Comfyui Mode 130 Seconds per run (2nd run) #5

Open adamreading opened 3 weeks ago

adamreading commented 3 weeks ago

Hey guys

I am really keen to incorporate OmniGen into my workflow, and I am super grateful that you made a node.

But I don't understand why your standard text-to-image workflow takes 130+ seconds every run (and that was at 512 x 576), when the standalone OmniGen - loaded on the same cloud service, on the same GPU - takes only 30 seconds a run at 1024 x 1024...

Is there some setting that would make it so much less efficient? For me it takes the cost per run from 2 cents up to 7-10 cents, and that's just for small output sizes.

Many thanks

0X-JonMichaelGalindo commented 3 weeks ago

This node apparently unloads the model from VRAM after each generation, then reloads it before executing the next one. As far as I can tell, that is where the slowdown is happening. I do not know how to resolve it.

Edit: Keeping the model in VRAM cut generation time for me. It may resolve your issue as well, although I can't be sure. You can see the modified code in my fork: https://github.com/0X-JonMichaelGalindo/OmniGen-ComfyUI
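
(For reference, a minimal sketch of the keep-the-model-in-VRAM pattern - a module-level cache. `OmniGenPipeline.from_pretrained` comes from the OmniGen package; everything else here is illustrative, not the fork's actual code.)

```python
# Illustrative sketch, not the fork's actual code: cache the pipeline at
# module level so a second run reuses it instead of reloading the safetensors.
from OmniGen import OmniGenPipeline  # assumes the OmniGen package is installed

_CACHED_PIPE = None  # persists across node executions within one ComfyUI session

def get_pipeline(model_path: str):
    """Load the OmniGen pipeline once, then reuse it on later runs."""
    global _CACHED_PIPE
    if _CACHED_PIPE is None:
        # The first run pays the full safetensors load cost...
        _CACHED_PIPE = OmniGenPipeline.from_pretrained(model_path)
    # ...later runs skip straight to inference.
    return _CACHED_PIPE
```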

adamreading commented 3 weeks ago

> This node apparently unloads the model from VRAM after each generation, then reloads it before executing the next one. As far as I can tell, that is where the slowdown is happening. I do not know how to resolve it.
>
> Edit: Keeping the model in VRAM cut generation time for me. It may resolve your issue as well, although I can't be sure. You can see the modified code in my fork: https://github.com/0X-JonMichaelGalindo/OmniGen-ComfyUI

That's amazing, will load it in a sec. I'm currently building out an AI Hub for sharing/learning with some imaging experts and others - we even have FluxBots where you can collaboratively build and iterate images, lol. You would be really welcome to come help us! https://discord.gg/TWGS4BYM

adamreading commented 3 weeks ago

> This node apparently unloads the model from VRAM after each generation, then reloads it before executing the next one. […]

Right now it seems to be taking longer and still reloading. I just git-cloned the new fork over the old node in the folder - should I have done anything else? What settings are you using? (It seems to be loading the safetensors twice in the logs, one after the other.)
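
(One generic way to pin down the double load from the logs - a hypothetical timing wrapper, not something that exists in the node:)

```python
# Hypothetical instrumentation: wrap whatever load call the node makes so
# each actual safetensors load shows up as its own line in the log.
import time

def timed_load(load_fn, *args, **kwargs):
    t0 = time.time()
    result = load_fn(*args, **kwargs)
    print(f"[OmniGen] model load finished in {time.time() - t0:.1f}s")
    return result
```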

adamreading commented 3 weeks ago

OmniGen Forked ComfyUI

Text to Image

Ultra, 1024 x 1024, 50 steps, Separate: true, Offload: false

- 1st run: 313 s
- 2nd run: 119.85 s
- 3rd run: 179.27 s
- 4th run (swapped node): 288 s

Attempt 2 - turned off Separate (1024 x 1024, 50 steps, Separate: false, Offload: false)

- 1st run: 320 s, RAM 60%
- 2nd run: RAM 60%, GPU 90%, VRAM 22%

Reset to the original fork and node

- 1st run: 189.46 s
- 2nd run: 140.14 s

adamreading commented 3 weeks ago

> Edit: Keeping the model in VRAM cut generation time for me. […] You can see the modified code in my fork: https://github.com/0X-JonMichaelGalindo/OmniGen-ComfyUI

When you run your fork locally, is it trying to load the model twice for you too?

adamreading commented 2 weeks ago

For anyone interested: the latest update adding latent support etc. has fixed all my cloud running issues and it's working fine - there's a shared template to launch it fully ready to go at https://mimicpc.com/?fpr=adam47

king848 commented 2 weeks ago

I've separated out the load-model module so the safetensors only need to be loaded on the first run; after that, the only time spent is on inference. See https://github.com/king848/OmniGen-ComfyUI
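
(A rough sketch of what that loader/sampler split looks like in ComfyUI node terms - class, type, and parameter names here are illustrative; see the linked fork for the real implementation.)

```python
# Illustrative sketch of the split: a loader node that ComfyUI only
# re-executes when its inputs change, and a sampler node that only infers.
import numpy as np
import torch

class OmniGenLoader:
    @classmethod
    def INPUT_TYPES(cls):
        return {"required": {"model_path": ("STRING", {"default": "Shitao/OmniGen-v1"})}}

    RETURN_TYPES = ("OMNIGEN_PIPE",)
    FUNCTION = "load"
    CATEGORY = "OmniGen"

    def load(self, model_path):
        from OmniGen import OmniGenPipeline
        # Runs once per model_path; ComfyUI caches the output, so later
        # queues skip the safetensors load entirely.
        return (OmniGenPipeline.from_pretrained(model_path),)

class OmniGenSampler:
    @classmethod
    def INPUT_TYPES(cls):
        return {"required": {"pipe": ("OMNIGEN_PIPE",),
                             "prompt": ("STRING", {"multiline": True})}}

    RETURN_TYPES = ("IMAGE",)
    FUNCTION = "generate"
    CATEGORY = "OmniGen"

    def generate(self, pipe, prompt):
        # Inference only; no model load on this path.
        images = pipe(prompt=prompt, height=1024, width=1024, guidance_scale=2.5)
        # Convert the PIL output to ComfyUI's IMAGE tensor format
        # (batch, height, width, channels; floats in 0-1).
        arr = np.array(images[0]).astype(np.float32) / 255.0
        return (torch.from_numpy(arr).unsqueeze(0),)
```

With a split like this, only the first queue should pay the load cost; later queues should be inference time only.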