Open caojiachen1 opened 1 week ago
I actually tested it today. The bigger issue is that the latest torchao versions have to be compiled to install on Windows, which isn't all that simple. Similarly, torch.compile requires Triton (Linux only), and on Linux I find onediff to be much faster anyway.
I did get int4 torchao running on Windows after some trouble; it used around 12 GB of VRAM and was very slow at 9 s/it.
The CogVideoX model uses torchao as its official quantization method, which achieves a good balance between inference speed and output video quality. The current fp8 quantization seems to incur significant quality loss, especially when contrasted with the torchao method. I tried to modify the custom node to support torchao, but the program always runs into an exception, which is really frustrating.