Open SLAPaper opened 6 months ago
I remember trying it but couldn't get it to work with the T5 model provided by diffusers. I'd have to somehow run the inference in 32-bit/16-bit while keeping the weights in the 8-bit format. That would involve re-implementing T5 from scratch, I think?
As for the model itself, it most likely doesn't need it, since it's only around 1.2 GB in FP16.
The fp8 storage type was introduced in https://github.com/comfyanonymous/ComfyUI/issues/2157 and significantly reduces VRAM usage, so I wonder whether the PixArt models support it yet?