cwchenwang closed this issue 5 months ago
I think I have figured out how to do it: scale image -> scale latents -> / vae.scaling_factor -> * vae.scaling_factor -> unscale latents -> unscale images.
Can you explain why it should be in this order? I ran into the same confusion. I think the right order is image -> (self.image_processor.preprocess) -> scale_image -> (self.vae.encode) -> * self.vae.config.scaling_factor -> scale_latent -> latent, but the colors still do not match.
The scale and unscale operations should be symmetric.
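To make the symmetry point concrete, here is a minimal sketch of an encode path that mirrors the decode order quoted in this thread (unscale latents, divide by the scaling factor, VAE decode, unscale image). The numeric constants, `SCALING_FACTOR`, and the identity `fake_vae_encode` / `fake_vae_decode` stand-ins are assumptions made for illustration, not values confirmed here; check the repository's pipeline code for the real ones.

```python
# Illustrative sketch only. All constants and the identity "VAE" are
# assumptions for demonstration; they are not confirmed by this thread.

SCALING_FACTOR = 0.18215  # assumed stand-in for vae.config.scaling_factor


def scale_image(x):
    # Assumed image scaling; unscale_image must be its exact inverse.
    return x * 0.5 / 0.8


def unscale_image(x):
    return x / 0.5 * 0.8


def scale_latents(z):
    # Assumed latent scaling; unscale_latents must be its exact inverse.
    return (z - 0.22) * 0.75


def unscale_latents(z):
    return z / 0.75 + 0.22


def fake_vae_encode(x):
    # Hypothetical placeholder for vae.encode(...); identity for the demo.
    return x


def fake_vae_decode(z):
    # Hypothetical placeholder for vae.decode(...); identity for the demo.
    return z


def encode(image):
    # Encode side: scale image -> VAE encode -> * scaling factor -> scale latents.
    z = fake_vae_encode(scale_image(image))
    return scale_latents(z * SCALING_FACTOR)


def decode(latents):
    # Decode side undoes each encode step in reverse order:
    # unscale latents -> / scaling factor -> VAE decode -> unscale image.
    z = unscale_latents(latents) / SCALING_FACTOR
    return unscale_image(fake_vae_decode(z))


pixel = 0.3
roundtrip = decode(encode(pixel))
print(abs(roundtrip - pixel) < 1e-9)  # the symmetric round trip recovers the input
```

If any single step is skipped on only one side (for example, calling `scale_latents` when encoding but not `unscale_latents` when decoding), the round trip no longer returns the input, which would be one plausible source of a color shift.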
I noticed in the code that when generating from Gaussian noise, we need to apply unscale_latents first, then divide by vae.config.scaling_factor, then run the VAE decoder, and finally unscale the image to get the final result. However, when I try to denoise an input image directly, how should I apply the scale operations during encoding and decoding? I tried the following code:
However, the denoised output (left) has a different color than the input image (right):

![image](https://github.com/SUDO-AI-3D/zero123plus/assets/23579918/30243347-d071-42d6-8fd0-dedec1128955)
If I delete all the scale and unscale functions, the results seem to be correct. So how should these scale and unscale functions be used?