nod-ai / SHARK-ModelDev

Unified compiler/runtime for interfacing with PyTorch Dynamo.
Apache License 2.0
95 stars 48 forks

[model] Switch to hermetically built FP16 VAE model #769

Closed — stellaraccident closed 4 months ago

stellaraccident commented 4 months ago

We've uploaded a hermetically built FP16 VAE to https://huggingface.co/amd-shark/sdxl-quant-models/tree/main/vae

Let's clean this up and use it by default in the pipeline.

monorimet commented 4 months ago

https://github.com/nod-ai/SHARK-Turbine/commit/e46a2a226938ba4f4b8ee23a11959db901194eb7 switches the default f16 path to use this model. Perhaps there is a better way to instantiate it -- we are essentially using the base sdxl-vae config and pulling in the amd-shark quantized VAE as a state dict.

nickfraser commented 4 months ago

Don't think of this as a "quantized VAE". We've just applied some linear transformations to the weights of some layers of the VAE so that the internal activations do not overflow FP16. As such, the state dict should be compatible with a standard VAE config. There are no quantization parameters associated with this network.

TL;DR - I think what you're doing is the correct approach.
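The rescaling idea described above can be illustrated on a toy two-layer network (this is a hedged sketch, not the actual transformation applied to the SDXL VAE: the layer shapes, scale factor, and use of ReLU are assumptions for illustration). Because ReLU is positively homogeneous, dividing one layer's weights by a factor `s` and multiplying the next layer's weights by `s` leaves the network output unchanged while shrinking the internal activation magnitudes by `s`, keeping them inside FP16 range:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-layer network: y = W2 @ relu(W1 @ x).
# W1 is scaled up so the internal activation is large (overflow-prone in FP16).
W1 = rng.normal(size=(8, 4)) * 100.0
W2 = rng.normal(size=(4, 8))
x = rng.normal(size=4)

def relu(a):
    return np.maximum(a, 0.0)

def forward(w1, w2, x):
    h = relu(w1 @ x)               # internal activation
    return w2 @ h, np.abs(h).max()  # output and activation peak

y_ref, peak_ref = forward(W1, W2, x)

# Linear rescaling: divide W1 by s, multiply W2 by s.
# relu(h / s) == relu(h) / s for s > 0, so the output is unchanged
# while the internal activation shrinks by a factor of s.
s = 256.0
y_new, peak_new = forward(W1 / s, W2 * s, x)

assert np.allclose(y_ref, y_new)     # output preserved
assert peak_new < peak_ref / 100.0   # activations much smaller
```

Note that no quantization metadata is introduced anywhere: the result is just a different set of weights with the same network function, which is why a standard VAE config can load the modified state dict directly.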