JemiloII opened this issue 1 month ago
Where does "/assets/vae/orangemix.vae.pt" come from?
Cc: @DN6 for single file.
Any VAE that's saved in .pt format.
Link to the VAE I'm using: https://huggingface.co/WarriorMama777/OrangeMixs/blob/main/VAEs/orangemix.vae.pt
I know there is a diffusers version of this, but that doesn't work. It was broken in v0.27.2, so I switched to safetensors. Besides, the diffusers version doesn't have a working VAE anyway.
Hi @JemiloII, the issue here isn't the .pt format. Rather, the checkpoint contains serialised objects that are not model weights. See the attached screenshot below.
We switched to disallowing loading arbitrary serialised objects from pickle files after 0.27.2, since this is a potential security risk. Using torch.load with weights_only=False allows executing code on the user's machine. See the linked discussions:
https://github.com/pytorch/pytorch/issues/52181
https://github.com/pytorch/pytorch/issues/52596
https://github.com/voicepaw/so-vits-svc-fork/issues/193
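The risk is inherent to the pickle format itself, not to torch. A minimal, harmless sketch of the mechanism (plain pickle, no model weights; the builtin list stands in for something dangerous like os.system):

```python
import pickle

class Payload:
    # pickle rebuilds objects from the (callable, args) pair returned
    # by __reduce__; a crafted file can name any importable callable.
    def __reduce__(self):
        # harmless stand-in: "loading" the blob calls list("pwned")
        return (list, ("pwned",))

blob = pickle.dumps(Payload())
obj = pickle.loads(blob)  # the named callable runs during load
print(obj)  # → ['p', 'w', 'n', 'e', 'd']
```

This is why loading an untrusted .pt file with weights_only=False is equivalent to running code from its author.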
You can load the VAE state dict with weights_only=False in the following way:
import torch
from huggingface_hub import hf_hub_download
from diffusers import AutoencoderKL

# Deserialise the .pt checkpoint explicitly, then hand the state dict
# to from_single_file
state_dict = torch.load(
    hf_hub_download("WarriorMama777/OrangeMixs", filename="VAEs/orangemix.vae.pt"),
    weights_only=False,
)
vae = AutoencoderKL.from_single_file(state_dict)
That doesn't work.
Traceback (most recent call last):
File "C:\Users\Shibiko AI\AppData\Roaming\JetBrains\IntelliJIdea2024.2\plugins\python\helpers-pro\pydevd_asyncio\pydevd_nest_asyncio.py", line 138, in run
return loop.run_until_complete(task)
File "C:\Users\Shibiko AI\AppData\Roaming\JetBrains\IntelliJIdea2024.2\plugins\python\helpers-pro\pydevd_asyncio\pydevd_nest_asyncio.py", line 243, in run_until_complete
return f.result()
File "C:\Program Files\Python310\lib\asyncio\futures.py", line 201, in result
raise self._exception.with_traceback(self._exception_tb)
File "C:\Program Files\Python310\lib\asyncio\tasks.py", line 232, in __step
result = coro.send(None)
File "C:\Users\Shibiko AI\Desktop\shibiko ai\diffusion-ai\main.py", line 81, in main
pipe, clip_layers = shibiko_init(settings, device)
File "C:\Users\Shibiko AI\Desktop\shibiko ai\diffusion-ai\src\generation.py", line 144, in shibiko_init
pipe.vae = AutoencoderKL.from_single_file(state_dict)
File "C:\Users\Shibiko AI\Desktop\shibiko ai\diffusion-ai\.venv\lib\site-packages\huggingface_hub\utils\_validators.py", line 114, in _inner_fn
return fn(*args, **kwargs)
File "C:\Users\Shibiko AI\Desktop\shibiko ai\diffusion-ai\.venv\lib\site-packages\diffusers\loaders\autoencoder.py", line 119, in from_single_file
original_config, checkpoint = fetch_ldm_config_and_checkpoint(
File "C:\Users\Shibiko AI\Desktop\shibiko ai\diffusion-ai\.venv\lib\site-packages\diffusers\loaders\single_file_utils.py", line 314, in fetch_ldm_config_and_checkpoint
checkpoint = load_single_file_model_checkpoint(
File "C:\Users\Shibiko AI\Desktop\shibiko ai\diffusion-ai\.venv\lib\site-packages\diffusers\loaders\single_file_utils.py", line 339, in load_single_file_model_checkpoint
if os.path.isfile(pretrained_model_link_or_path):
File "C:\Program Files\Python310\lib\genericpath.py", line 30, in isfile
st = os.stat(path)
TypeError: stat: path should be string, bytes, os.PathLike or integer, not dict
python-BaseException
Process finished with exit code 1
I even tried pipe.vae.load_state_dict. No dice there. I even went back and tried that on 0.27.2...
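For context, a raw pipe.vae.load_state_dict would likely fail on key names alone: the checkpoint uses the original LDM module layout, while diffusers' AutoencoderKL renames the modules, so keys need converting first. An illustrative correspondence (the target names are assumptions based on diffusers' module layout, not an exhaustive mapping):

```python
# Original-layout keys (left) vs. the renamed diffusers AutoencoderKL
# modules (right); illustrative pairs only, target names assumed.
ldm_to_diffusers = {
    "encoder.down.0.block.0.norm1.weight":
        "encoder.down_blocks.0.resnets.0.norm1.weight",
    "encoder.mid.attn_1.q.weight":
        "encoder.mid_block.attentions.0.to_q.weight",
}
print(ldm_to_diffusers["encoder.mid.attn_1.q.weight"])
```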
Not sure if this is helpful, but the state_dict has these keys:
encoder.conv_in.weight
encoder.conv_in.bias
encoder.down.0.block.0.norm1.weight
encoder.down.0.block.0.norm1.bias
encoder.down.0.block.0.conv1.weight
encoder.down.0.block.0.conv1.bias
encoder.down.0.block.0.norm2.weight
encoder.down.0.block.0.norm2.bias
encoder.down.0.block.0.conv2.weight
encoder.down.0.block.0.conv2.bias
encoder.down.0.block.1.norm1.weight
encoder.down.0.block.1.norm1.bias
encoder.down.0.block.1.conv1.weight
encoder.down.0.block.1.conv1.bias
encoder.down.0.block.1.norm2.weight
encoder.down.0.block.1.norm2.bias
encoder.down.0.block.1.conv2.weight
encoder.down.0.block.1.conv2.bias
encoder.down.0.downsample.conv.weight
encoder.down.0.downsample.conv.bias
encoder.down.1.block.0.norm1.weight
encoder.down.1.block.0.norm1.bias
encoder.down.1.block.0.conv1.weight
encoder.down.1.block.0.conv1.bias
encoder.down.1.block.0.norm2.weight
encoder.down.1.block.0.norm2.bias
encoder.down.1.block.0.conv2.weight
encoder.down.1.block.0.conv2.bias
encoder.down.1.block.0.nin_shortcut.weight
encoder.down.1.block.0.nin_shortcut.bias
encoder.down.1.block.1.norm1.weight
encoder.down.1.block.1.norm1.bias
encoder.down.1.block.1.conv1.weight
encoder.down.1.block.1.conv1.bias
encoder.down.1.block.1.norm2.weight
encoder.down.1.block.1.norm2.bias
encoder.down.1.block.1.conv2.weight
encoder.down.1.block.1.conv2.bias
encoder.down.1.downsample.conv.weight
encoder.down.1.downsample.conv.bias
encoder.down.2.block.0.norm1.weight
encoder.down.2.block.0.norm1.bias
encoder.down.2.block.0.conv1.weight
encoder.down.2.block.0.conv1.bias
encoder.down.2.block.0.norm2.weight
encoder.down.2.block.0.norm2.bias
encoder.down.2.block.0.conv2.weight
encoder.down.2.block.0.conv2.bias
encoder.down.2.block.0.nin_shortcut.weight
encoder.down.2.block.0.nin_shortcut.bias
encoder.down.2.block.1.norm1.weight
encoder.down.2.block.1.norm1.bias
encoder.down.2.block.1.conv1.weight
encoder.down.2.block.1.conv1.bias
encoder.down.2.block.1.norm2.weight
encoder.down.2.block.1.norm2.bias
encoder.down.2.block.1.conv2.weight
encoder.down.2.block.1.conv2.bias
encoder.down.2.downsample.conv.weight
encoder.down.2.downsample.conv.bias
encoder.down.3.block.0.norm1.weight
encoder.down.3.block.0.norm1.bias
encoder.down.3.block.0.conv1.weight
encoder.down.3.block.0.conv1.bias
encoder.down.3.block.0.norm2.weight
encoder.down.3.block.0.norm2.bias
encoder.down.3.block.0.conv2.weight
encoder.down.3.block.0.conv2.bias
encoder.down.3.block.1.norm1.weight
encoder.down.3.block.1.norm1.bias
encoder.down.3.block.1.conv1.weight
encoder.down.3.block.1.conv1.bias
encoder.down.3.block.1.norm2.weight
encoder.down.3.block.1.norm2.bias
encoder.down.3.block.1.conv2.weight
encoder.down.3.block.1.conv2.bias
encoder.mid.block_1.norm1.weight
encoder.mid.block_1.norm1.bias
encoder.mid.block_1.conv1.weight
encoder.mid.block_1.conv1.bias
encoder.mid.block_1.norm2.weight
encoder.mid.block_1.norm2.bias
encoder.mid.block_1.conv2.weight
encoder.mid.block_1.conv2.bias
encoder.mid.attn_1.norm.weight
encoder.mid.attn_1.norm.bias
encoder.mid.attn_1.q.weight
encoder.mid.attn_1.q.bias
encoder.mid.attn_1.k.weight
encoder.mid.attn_1.k.bias
encoder.mid.attn_1.v.weight
encoder.mid.attn_1.v.bias
encoder.mid.attn_1.proj_out.weight
encoder.mid.attn_1.proj_out.bias
encoder.mid.block_2.norm1.weight
encoder.mid.block_2.norm1.bias
encoder.mid.block_2.conv1.weight
encoder.mid.block_2.conv1.bias
encoder.mid.block_2.norm2.weight
encoder.mid.block_2.norm2.bias
encoder.mid.block_2.conv2.weight
encoder.mid.block_2.conv2.bias
encoder.norm_out.weight
encoder.norm_out.bias
encoder.conv_out.weight
encoder.conv_out.bias
decoder.conv_in.weight
decoder.conv_in.bias
decoder.mid.block_1.norm1.weight
decoder.mid.block_1.norm1.bias
decoder.mid.block_1.conv1.weight
decoder.mid.block_1.conv1.bias
decoder.mid.block_1.norm2.weight
decoder.mid.block_1.norm2.bias
decoder.mid.block_1.conv2.weight
decoder.mid.block_1.conv2.bias
decoder.mid.attn_1.norm.weight
decoder.mid.attn_1.norm.bias
decoder.mid.attn_1.q.weight
decoder.mid.attn_1.q.bias
decoder.mid.attn_1.k.weight
decoder.mid.attn_1.k.bias
decoder.mid.attn_1.v.weight
decoder.mid.attn_1.v.bias
decoder.mid.attn_1.proj_out.weight
decoder.mid.attn_1.proj_out.bias
decoder.mid.block_2.norm1.weight
decoder.mid.block_2.norm1.bias
decoder.mid.block_2.conv1.weight
decoder.mid.block_2.conv1.bias
decoder.mid.block_2.norm2.weight
decoder.mid.block_2.norm2.bias
decoder.mid.block_2.conv2.weight
decoder.mid.block_2.conv2.bias
decoder.up.0.block.0.norm1.weight
decoder.up.0.block.0.norm1.bias
decoder.up.0.block.0.conv1.weight
decoder.up.0.block.0.conv1.bias
decoder.up.0.block.0.norm2.weight
decoder.up.0.block.0.norm2.bias
decoder.up.0.block.0.conv2.weight
decoder.up.0.block.0.conv2.bias
decoder.up.0.block.0.nin_shortcut.weight
decoder.up.0.block.0.nin_shortcut.bias
decoder.up.0.block.1.norm1.weight
decoder.up.0.block.1.norm1.bias
decoder.up.0.block.1.conv1.weight
decoder.up.0.block.1.conv1.bias
decoder.up.0.block.1.norm2.weight
decoder.up.0.block.1.norm2.bias
decoder.up.0.block.1.conv2.weight
decoder.up.0.block.1.conv2.bias
decoder.up.0.block.2.norm1.weight
decoder.up.0.block.2.norm1.bias
decoder.up.0.block.2.conv1.weight
decoder.up.0.block.2.conv1.bias
decoder.up.0.block.2.norm2.weight
decoder.up.0.block.2.norm2.bias
decoder.up.0.block.2.conv2.weight
decoder.up.0.block.2.conv2.bias
decoder.up.1.block.0.norm1.weight
decoder.up.1.block.0.norm1.bias
decoder.up.1.block.0.conv1.weight
decoder.up.1.block.0.conv1.bias
decoder.up.1.block.0.norm2.weight
decoder.up.1.block.0.norm2.bias
decoder.up.1.block.0.conv2.weight
decoder.up.1.block.0.conv2.bias
decoder.up.1.block.0.nin_shortcut.weight
decoder.up.1.block.0.nin_shortcut.bias
decoder.up.1.block.1.norm1.weight
decoder.up.1.block.1.norm1.bias
decoder.up.1.block.1.conv1.weight
decoder.up.1.block.1.conv1.bias
decoder.up.1.block.1.norm2.weight
decoder.up.1.block.1.norm2.bias
decoder.up.1.block.1.conv2.weight
decoder.up.1.block.1.conv2.bias
decoder.up.1.block.2.norm1.weight
decoder.up.1.block.2.norm1.bias
decoder.up.1.block.2.conv1.weight
decoder.up.1.block.2.conv1.bias
decoder.up.1.block.2.norm2.weight
decoder.up.1.block.2.norm2.bias
decoder.up.1.block.2.conv2.weight
decoder.up.1.block.2.conv2.bias
decoder.up.1.upsample.conv.weight
decoder.up.1.upsample.conv.bias
decoder.up.2.block.0.norm1.weight
decoder.up.2.block.0.norm1.bias
decoder.up.2.block.0.conv1.weight
decoder.up.2.block.0.conv1.bias
decoder.up.2.block.0.norm2.weight
decoder.up.2.block.0.norm2.bias
decoder.up.2.block.0.conv2.weight
decoder.up.2.block.0.conv2.bias
decoder.up.2.block.1.norm1.weight
decoder.up.2.block.1.norm1.bias
decoder.up.2.block.1.conv1.weight
decoder.up.2.block.1.conv1.bias
decoder.up.2.block.1.norm2.weight
decoder.up.2.block.1.norm2.bias
decoder.up.2.block.1.conv2.weight
decoder.up.2.block.1.conv2.bias
decoder.up.2.block.2.norm1.weight
decoder.up.2.block.2.norm1.bias
decoder.up.2.block.2.conv1.weight
decoder.up.2.block.2.conv1.bias
decoder.up.2.block.2.norm2.weight
decoder.up.2.block.2.norm2.bias
decoder.up.2.block.2.conv2.weight
decoder.up.2.block.2.conv2.bias
decoder.up.2.upsample.conv.weight
decoder.up.2.upsample.conv.bias
decoder.up.3.block.0.norm1.weight
decoder.up.3.block.0.norm1.bias
decoder.up.3.block.0.conv1.weight
decoder.up.3.block.0.conv1.bias
decoder.up.3.block.0.norm2.weight
decoder.up.3.block.0.norm2.bias
decoder.up.3.block.0.conv2.weight
decoder.up.3.block.0.conv2.bias
decoder.up.3.block.1.norm1.weight
decoder.up.3.block.1.norm1.bias
decoder.up.3.block.1.conv1.weight
decoder.up.3.block.1.conv1.bias
decoder.up.3.block.1.norm2.weight
decoder.up.3.block.1.norm2.bias
decoder.up.3.block.1.conv2.weight
decoder.up.3.block.1.conv2.bias
decoder.up.3.block.2.norm1.weight
decoder.up.3.block.2.norm1.bias
decoder.up.3.block.2.conv1.weight
decoder.up.3.block.2.conv1.bias
decoder.up.3.block.2.norm2.weight
decoder.up.3.block.2.norm2.bias
decoder.up.3.block.2.conv2.weight
decoder.up.3.block.2.conv2.bias
decoder.up.3.upsample.conv.weight
decoder.up.3.upsample.conv.bias
decoder.norm_out.weight
decoder.norm_out.bias
decoder.conv_out.weight
decoder.conv_out.bias
loss.logvar
loss.perceptual_loss.scaling_layer.shift
loss.perceptual_loss.scaling_layer.scale
loss.perceptual_loss.net.slice1.0.weight
loss.perceptual_loss.net.slice1.0.bias
loss.perceptual_loss.net.slice1.2.weight
loss.perceptual_loss.net.slice1.2.bias
loss.perceptual_loss.net.slice2.5.weight
loss.perceptual_loss.net.slice2.5.bias
loss.perceptual_loss.net.slice2.7.weight
loss.perceptual_loss.net.slice2.7.bias
loss.perceptual_loss.net.slice3.10.weight
loss.perceptual_loss.net.slice3.10.bias
loss.perceptual_loss.net.slice3.12.weight
loss.perceptual_loss.net.slice3.12.bias
loss.perceptual_loss.net.slice3.14.weight
loss.perceptual_loss.net.slice3.14.bias
loss.perceptual_loss.net.slice4.17.weight
loss.perceptual_loss.net.slice4.17.bias
loss.perceptual_loss.net.slice4.19.weight
loss.perceptual_loss.net.slice4.19.bias
loss.perceptual_loss.net.slice4.21.weight
loss.perceptual_loss.net.slice4.21.bias
loss.perceptual_loss.net.slice5.24.weight
loss.perceptual_loss.net.slice5.24.bias
loss.perceptual_loss.net.slice5.26.weight
loss.perceptual_loss.net.slice5.26.bias
loss.perceptual_loss.net.slice5.28.weight
loss.perceptual_loss.net.slice5.28.bias
loss.perceptual_loss.lin0.model.1.weight
loss.perceptual_loss.lin1.model.1.weight
loss.perceptual_loss.lin2.model.1.weight
loss.perceptual_loss.lin3.model.1.weight
loss.perceptual_loss.lin4.model.1.weight
loss.discriminator.main.0.weight
loss.discriminator.main.0.bias
loss.discriminator.main.2.weight
loss.discriminator.main.3.weight
loss.discriminator.main.3.bias
loss.discriminator.main.3.running_mean
loss.discriminator.main.3.running_var
loss.discriminator.main.3.num_batches_tracked
loss.discriminator.main.5.weight
loss.discriminator.main.6.weight
loss.discriminator.main.6.bias
loss.discriminator.main.6.running_mean
loss.discriminator.main.6.running_var
loss.discriminator.main.6.num_batches_tracked
loss.discriminator.main.8.weight
loss.discriminator.main.9.weight
loss.discriminator.main.9.bias
loss.discriminator.main.9.running_mean
loss.discriminator.main.9.running_var
loss.discriminator.main.9.num_batches_tracked
loss.discriminator.main.11.weight
loss.discriminator.main.11.bias
quant_conv.weight
quant_conv.bias
post_quant_conv.weight
post_quant_conv.bias
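One thing that dump makes visible: the loss.* entries are training-time objects (a perceptual loss network and a discriminator), not part of the autoencoder itself. A minimal sketch, on dummy data, of stripping them down to the autoencoder weights before re-saving (the prefix list is an assumption drawn from the keys above):

```python
# Dummy stand-in for the loaded checkpoint; real values are tensors.
state_dict = {
    "encoder.conv_in.weight": "tensor",
    "decoder.conv_out.bias": "tensor",
    "quant_conv.weight": "tensor",
    "loss.perceptual_loss.lin0.model.1.weight": "tensor",  # training-only
    "loss.discriminator.main.0.weight": "tensor",          # training-only
}

# Keep only the autoencoder weights.
VAE_PREFIXES = ("encoder.", "decoder.", "quant_conv.", "post_quant_conv.")
vae_only = {k: v for k, v in state_dict.items() if k.startswith(VAE_PREFIXES)}
print(sorted(vae_only))  # → the three non-loss keys
```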
Which version of diffusers are you using? The snippet I shared is meant to be run with a version newer than 0.27.2. Based on the traceback, it seems like you're using version 0.27.2 to try to load the state dict?
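(Checking `import diffusers; print(diffusers.__version__)` confirms which install is actually imported. A self-contained comparison sketch, assuming plain "X.Y.Z" version strings; for real use, packaging.version handles pre-releases correctly:)

```python
# Hypothetical helper: parse a simple "X.Y.Z" version string into a
# tuple so versions compare numerically, not lexically.
def parse(v: str) -> tuple:
    return tuple(int(p) for p in v.split("."))

def newer_than_0_27_2(version: str) -> bool:
    return parse(version) > (0, 27, 2)

print(newer_than_0_27_2("0.27.2"))  # → False
print(newer_than_0_27_2("0.28.0"))  # → True
```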
I tried with 0.27.2 and the latest release. The goal is to update to the latest, but diffusers keeps making breaking changes. This .pt one is crazy, as many VAEs for SD1.5 are in .pt format, and many I've found for SDXL are in that format as well.
When an update happens, I don't expect my production app to just break when I've made zero code changes. I don't use the Hub, but I even tried with your Hub example. I don't like using the Hub, just local files, so I know nothing ever changes. The Hub will get updates, and it saves things in mysterious locations where you don't want model files anyway.
Describe the bug
It just doesn't load .pt files anymore. Really frustrating, as it's been broken for a long time now. I keep posting about it, so now I'll open an issue / bug report instead of messaging in update threads. The last working version is 0.27.2.
There is no working safetensors or diffusers version of the VAE I'm using, and I shouldn't need one. PT works just fine.
Reproduction
Logs
System Info
Python 3.10.9
Machine 1: AMD 7950X3D, RTX 4090 x2, 128GB DDR5, Windows 10
Machine 2: AMD 5950X, RTX 4090, 128GB DDR4, Windows 10
Who can help?
@sayakpaul