black-forest-labs / flux

Official inference repo for FLUX.1 models
Apache License 2.0

Wrong image range before vae encoding in gradio demo script #133

Open kingofprank opened 2 weeks ago

kingofprank commented 2 weeks ago

Hi, guys. Thanks for your great work. I noticed a possible problem when running demo_gr.py: init_image is only rescaled to [0, 1], but I think the correct range is [-1, 1]. For example, when setting Noising strength=0 in img2img mode with the flux-dev model, we get an overexposed picture, and the VAE reconstruction seems to fail.

Here is the relevant part of demo_gr.py, lines 71-79.

if isinstance(init_image, np.ndarray):
    init_image = torch.from_numpy(init_image).permute(2, 0, 1).float() / 255.0
    init_image = init_image.unsqueeze(0) 
init_image = init_image.to(self.device)
init_image = torch.nn.functional.interpolate(init_image, (opts.height, opts.width))
if self.offload:
    self.ae.encoder.to(self.device)
init_image = self.ae.encode(init_image.to())
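
For illustration, here is a minimal sketch of the rescaling I would expect before encoding. It is not a patch against the repo: to_vae_range is a hypothetical helper name, and it assumes the Gradio input arrives as a uint8 HWC array. The normalization is done in NumPy here so the snippet runs without torch; the result could then go through torch.from_numpy as in the code above.

```python
import numpy as np


def to_vae_range(image_u8: np.ndarray) -> np.ndarray:
    """Hypothetical helper: map a uint8 HWC image to float32 NCHW in [-1, 1]."""
    if image_u8.dtype != np.uint8:
        raise ValueError("expected a uint8 image with values in [0, 255]")
    # 0 -> -1.0 and 255 -> 1.0; same mapping as 2.0 * (x / 255.0) - 1.0
    scaled = image_u8.astype(np.float32) / 127.5 - 1.0
    # HWC -> CHW, then add a leading batch axis
    return np.ascontiguousarray(scaled.transpose(2, 0, 1)[None, ...])


# Synthetic 2x2 RGB image covering both extremes of the uint8 range
demo = np.array(
    [[[0, 0, 0], [255, 255, 255]], [[0, 255, 0], [255, 0, 255]]],
    dtype=np.uint8,
)
batch = to_vae_range(demo)
print(batch.shape, batch.min(), batch.max())
```

Dividing by 127.5 instead of 255.0 and subtracting 1 is the only change relative to the quoted snippet; the permute and unsqueeze steps are unchanged in effect.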

In demo_st.py, the image is rescaled to [-1, 1] in the get_image() function before VAE encoding, and with that script we get a normal result.

def get_image() -> torch.Tensor | None:
    image = st.file_uploader("Input", type=["jpg", "JPEG", "png"])
    if image is None:
        return None
    image = Image.open(image).convert("RGB")

    transform = transforms.Compose(
        [
            transforms.ToTensor(),
            transforms.Lambda(lambda x: 2.0 * x - 1.0),
        ]
    )
    img: torch.Tensor = transform(image)
    return img[None, ...]
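
A rough intuition for why the missing rescale could look like overexposure, as a sketch under two assumptions that I have not verified against the repo: the autoencoder reconstructs its input perfectly, and the decoded output is clamped to [-1, 1] and mapped back to [0, 1] with (x + 1) / 2. The real VAE is nonlinear, so this only shows the direction of the error, not its exact size. decode_to_unit is a hypothetical name.

```python
import numpy as np


def decode_to_unit(x: np.ndarray) -> np.ndarray:
    """Assumed output mapping: clamp to [-1, 1], then rescale to [0, 1]."""
    return (np.clip(x, -1.0, 1.0) + 1.0) / 2.0


# Black, dark, mid-gray, and white pixels expressed in [0, 1]
pixels = np.array([0.0, 0.25, 0.5, 1.0])

# Path with the rescale to [-1, 1], as in get_image() from demo_st.py
correct = decode_to_unit(2.0 * pixels - 1.0)

# Path without the rescale: [0, 1] values are treated as if they were in [-1, 1]
shifted = decode_to_unit(pixels)

print(correct)
print(shifted)
```

Under these assumptions the first path returns the original values, while the second lifts every pixel into [0.5, 1.0], so black comes out as mid-gray and the whole image is brighter.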

timudk commented 2 weeks ago

@apolinario can you take a look?