Deepcopy model for further use

Hi,

I have an SDXL model that I would like to accelerate with stable-fast. However, our implementation applies LoRA in its own way. Previously, we deep-copied the model, applied the LoRA to the copy by fusing the weights, ran inference on the copy, and deleted the copy after the run. We do this to avoid "fried" LoRA results, since the way we load a LoRA changes the model's weights.
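For reference, the copy-then-fuse pattern described above can be sketched roughly as follows. This is a toy illustration in plain Python, not our actual code: `ToyLinear`, `fuse_lora`, and `run_with_lora` are hypothetical names, and nested lists stand in for real model weights.

```python
import copy


class ToyLinear:
    """Hypothetical stand-in for one model layer; weight is a nested list."""

    def __init__(self, weight):
        self.weight = weight


def fuse_lora(layer, lora_down, lora_up, scale=1.0):
    """Fuse a low-rank update in place: W += scale * (lora_up @ lora_down)."""
    rows, cols, rank = len(layer.weight), len(layer.weight[0]), len(lora_down)
    for i in range(rows):
        for j in range(cols):
            delta = sum(lora_up[i][r] * lora_down[r][j] for r in range(rank))
            layer.weight[i][j] += scale * delta


def run_with_lora(base, lora_down, lora_up, scale=1.0):
    """Deep-copy the base layer, fuse the LoRA into the copy, then discard it."""
    working = copy.deepcopy(base)       # the base weights are never touched
    fuse_lora(working, lora_down, lora_up, scale)
    result = [row[:] for row in working.weight]  # stand-in for inference
    del working                         # drop the fused copy after the run
    return result


base = ToyLinear([[1.0, 0.0], [0.0, 1.0]])
down = [[1.0, 2.0]]    # rank-1 "down" matrix, shape (1, 2)
up = [[0.5], [0.25]]   # rank-1 "up" matrix, shape (2, 1)
fused = run_with_lora(base, down, up)
```

Because the fusion only ever touches the copy, `base.weight` is the same before and after each run, so repeated runs with different LoRAs do not accumulate changes in the original weights.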

However, with stable-fast the original model is loaded and compiled first (as I understand it, the model is registered at this step and the later tracing is based on that registration). If we then deep-copy the model as before, apply the LoRA to the copy, and run the tracing, we get odd errors. For example, in SDXL img2img the VAE needs to be float32 for latent generation, but we get an error indicating that the VAE is fp16 while the image is fp32. I think this is because the original model registered at compile time has an fp16 VAE.

Is there a way to solve this? Could you also explain how the registration of the compiled model works in stable-fast? Thanks!

chengzeyi/stable-fast, issue #114