FurkanGozukara opened 1 week ago
This doesn't make sense to me, because we are able to use FP8 mode for the FLUX model and T5 XXL when using FLUX with ComfyUI.
Please add support to your pipeline for running the transformer and T5 XXL in FP8; it is totally doable.
@zRzRzRzRzRzRzR @wenyihong @chenxwh
so that pipeline_cogvideox_image2video can run fast on 24 GB GPUs on Windows.
Currently it is mandatory to use CPU offloading, which is total overkill. Thank you.
This appears to be an issue with T5. Additionally, if your GPU is sufficient, you can remove all cpu_offload configuration and use pipe.to("cuda"). The current FP8 support is implemented through torchao, with FP8 weights and BF16 inference. If you want to use E4M3 inference, some adjustments will likely be needed. During my testing, I encountered an error similar to yours. I have contacted diffusers about this error, and I suspect there are some incompatibilities among the underlying libraries: torchao, torch, and diffusers. We will attempt to address this in the future, but this work may need to be completed by the community, as we currently do not have enough manpower.
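As a rough illustration of the scheme described above (FP8 weights with BF16 inference via torchao, keeping the whole pipeline on the GPU instead of using CPU offload), something like the following sketch may work. The checkpoint name, and the availability of `float8_weight_only` in your torchao build, are assumptions; this is not a tested recipe:

```python
# Hedged sketch: quantize the CogVideoX transformer weights to FP8 with
# torchao while activations stay in BF16, then keep everything on the GPU.
# Assumes a recent diffusers with CogVideoXImageToVideoPipeline and a
# torchao build that exposes float8_weight_only; the model ID is illustrative.
import torch
from diffusers import CogVideoXImageToVideoPipeline
from torchao.quantization import quantize_, float8_weight_only

pipe = CogVideoXImageToVideoPipeline.from_pretrained(
    "THUDM/CogVideoX-5b-I2V", torch_dtype=torch.bfloat16
)

# Quantize only the transformer weights to FP8; inference still runs in BF16,
# matching the "FP8 weights for BF16 inference" scheme described above.
quantize_(pipe.transformer, float8_weight_only())

# If the GPU has enough memory after quantization, skip
# pipe.enable_sequential_cpu_offload() entirely and move the pipeline to CUDA:
pipe.to("cuda")
```

In principle the T5 text encoder could be quantized the same way, but per the reply above, T5 is exactly where the current incompatibility shows up, so that step is likely to fail until the underlying issue is resolved.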
Thank you so much for the reply.
I have 24 GB, and without optimizations it uses 26 GB (I tested on a cloud machine).
I opened an issue on Diffusers as well; we are able to use T5 and FLUX in FP8, so I think CogVideo should be able to do the same.
I have opened a detailed issue here; does anyone have any ideas? It happens when FP8 is used:
https://github.com/huggingface/diffusers/issues/9539