Open YihanHu-2022 opened 2 hours ago
I'd like to use the SDXL VAE to encode my image, but I only get NaN values. I have set both the input and the VAE to full precision (torch.float32), but the problem persists.
import torch
from diffusers import StableDiffusionXLPipeline
from diffusers import DPMSolverMultistepScheduler
import numpy as np
from PIL import Image
from torch import autocast, inference_mode
from torchvision import transforms as tr

p2t = tr.ToTensor()
device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')

NUM_DDIM_STEPS = 50
SKIP = 0.0
ETA = 1
TOTAL_STEP = int(NUM_DDIM_STEPS * (1 + SKIP))

model_id = 'stabilityai/stable-diffusion-xl-base-1.0'
ldm_stable = StableDiffusionXLPipeline.from_pretrained(model_id, torch_dtype=torch.float16).to(device)
ldm_stable.scheduler = DPMSolverMultistepScheduler.from_config(model_id, subfolder="scheduler", algorithm_type="sde-dpmsolver++", solver_order=2)
ldm_stable.scheduler.config.timestep_spacing = "leading"
ldm_stable.scheduler.set_timesteps(TOTAL_STEP)

image_gt = Image.open('path/to/image.png').convert('RGB')
image_gt = image_gt.resize((1024, 1024))
image_gt = p2t(image_gt) * 2 - 1
image_gt = image_gt.unsqueeze(0).to(device, dtype=torch.float32)

ldm_stable.vae.to(dtype=torch.float32)
with autocast("cuda"), inference_mode():
    w0 = ldm_stable.vae.encode(image_gt).latent_dist.sample()
print(w0)
Loading pipeline components...: 100%|██████████| 7/7 [00:09<00:00, 1.36s/it]
/root/miniforge3/lib/python3.10/site-packages/diffusers/configuration_utils.py:245: FutureWarning: It is deprecated to pass a pretrained model name or path to `from_config`.If you were trying to load a scheduler, please use <class 'diffusers.schedulers.scheduling_dpmsolver_multistep.DPMSolverMultistepScheduler'>.from_pretrained(...) instead. Otherwise, please make sure to pass a configuration dictionary instead. This functionality will be removed in v1.0.0.
  deprecate("config-passed-as-path", "1.0.0", deprecation_message, standard_warn=False)
tensor([[[[nan, nan, nan, ..., nan, nan, nan],
          [nan, nan, nan, ..., nan, nan, nan],
          [nan, nan, nan, ..., nan, nan, nan],
          ...,
          [nan, nan, nan, ..., nan, nan, nan],
          [nan, nan, nan, ..., nan, nan, nan],
          [nan, nan, nan, ..., nan, nan, nan]],

         [[nan, nan, nan, ..., nan, nan, nan],
          [nan, nan, nan, ..., nan, nan, nan],
          [nan, nan, nan, ..., nan, nan, nan],
          ...,
          [nan, nan, nan, ..., nan, nan, nan],
          [nan, nan, nan, ..., nan, nan, nan],
          [nan, nan, nan, ..., nan, nan, nan]],

         [[nan, nan, nan, ..., nan, nan, nan],
          [nan, nan, nan, ..., nan, nan, nan],
          [nan, nan, nan, ..., nan, nan, nan],
          ...,
          [nan, nan, nan, ..., nan, nan, nan],
          [nan, nan, nan, ..., nan, nan, nan],
          [nan, nan, nan, ..., nan, nan, nan]],

         [[nan, nan, nan, ..., nan, nan, nan],
          [nan, nan, nan, ..., nan, nan, nan],
          [nan, nan, nan, ..., nan, nan, nan],
          ...,
          [nan, nan, nan, ..., nan, nan, nan],
          [nan, nan, nan, ..., nan, nan, nan],
          [nan, nan, nan, ..., nan, nan, nan]]]], device='cuda:0')
Diffusers: 0.30.0, PyTorch: 1.12, Transformers: 4.45.2, no xFormers
Running on an RTX 3090 Ti, CUDA version 11.7
Python version 3.10.14
@yiyixuxu @sayakpaul @DN6
Could this be because of the scheduler you're using? Does this happen when you use the default scheduler?
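For what it's worth, `vae.encode` does not go through the scheduler, so the scheduler is unlikely to matter for this particular call. A possible cause is the `autocast("cuda")` context in the reproduction: under autocast, convolutions and matmuls run in float16 even when the weights and the input are float32, and the SDXL VAE is known to overflow in float16. Below is a minimal sketch for testing that hypothesis. `encode_without_autocast` and `has_nan` are hypothetical helper names, not diffusers APIs, and the snippet assumes the `vae` object exposes `encode(...).latent_dist.sample()` as in the reproduction.

```python
import torch


def encode_without_autocast(vae, image):
    """Encode `image` in float32 with autocast explicitly disabled.

    Hypothetical helper (not part of diffusers). Assumes `vae` behaves like
    the pipeline's VAE in the reproduction above.
    """
    # Module.to() casts the weights in place and returns the same module.
    vae = vae.to(dtype=torch.float32)
    image = image.to(dtype=torch.float32)
    device_type = image.device.type  # "cuda" or "cpu"
    # enabled=False switches autocast off here even if an outer
    # autocast context is active around this call.
    with torch.inference_mode(), torch.autocast(device_type, enabled=False):
        latents = vae.encode(image).latent_dist.sample()
    return latents


def has_nan(t):
    """Return True if the tensor contains at least one NaN."""
    return bool(torch.isnan(t).any())
```

With the objects from the reproduction this would be called as `w0 = encode_without_autocast(ldm_stable.vae, image_gt)` followed by `print(has_nan(w0))`. If the NaNs disappear, autocast was the likely culprit; if they remain, the problem is elsewhere (for example in the input image itself).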