LinB203 closed this issue 2 months ago
I checked PixArt-alpha/PixArt-Sigma-XL-2-1024-MS. It does not have these keys either.
But PixArt-alpha/PixArt-XL-2-1024-MS has these keys.
I want to run inference at non-square resolutions such as 512×1024, not only 512×512 or 1024×1024.
There is no aspect_ratio_embedder or resolution_embedder in any of the PixArt-Sigma weights, so how can it work? Could you help me? I am confused about it.
@lawrence-cj
Show me your whole command
import torch
from diffusers import Transformer2DModel, PixArtSigmaPipeline
from torch import nn
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
weight_dtype = torch.float16
transformer = Transformer2DModel.from_pretrained(
    "PixArt-alpha/PixArt-Sigma-XL-2-512-MS",
    subfolder='transformer', cache_dir='./cache_dir',
    torch_dtype=weight_dtype,
    use_safetensors=True,
)
ValueError: Cannot load <class 'diffusers.models.transformers.transformer_2d.Transformer2DModel'> from PixArt-alpha/PixArt-Sigma-XL-2-512-MS because the following keys are missing:
adaln_single.emb.aspect_ratio_embedder.linear_1.bias, adaln_single.emb.resolution_embedder.linear_1.weight, adaln_single.emb.aspect_ratio_embedder.linear_2.bias, adaln_single.emb.resolution_embedder.linear_2.bias, adaln_single.emb.resolution_embedder.linear_1.bias, adaln_single.emb.resolution_embedder.linear_2.weight, adaln_single.emb.aspect_ratio_embedder.linear_2.weight, adaln_single.emb.aspect_ratio_embedder.linear_1.weight.
Please make sure to pass `low_cpu_mem_usage=False` and `device_map=None` if you want to randomly initialize those weights or else make sure your checkpoint file is correct.
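To see at a glance which submodules the error is complaining about, the missing keys can be grouped by their module prefix. This is only a small sketch: the key list is copied from the ValueError above, and group_by_submodule is a hypothetical helper, not part of diffusers.

```python
from collections import defaultdict

# Missing keys copied from the ValueError above.
missing_keys = [
    "adaln_single.emb.aspect_ratio_embedder.linear_1.bias",
    "adaln_single.emb.resolution_embedder.linear_1.weight",
    "adaln_single.emb.aspect_ratio_embedder.linear_2.bias",
    "adaln_single.emb.resolution_embedder.linear_2.bias",
    "adaln_single.emb.resolution_embedder.linear_1.bias",
    "adaln_single.emb.resolution_embedder.linear_2.weight",
    "adaln_single.emb.aspect_ratio_embedder.linear_2.weight",
    "adaln_single.emb.aspect_ratio_embedder.linear_1.weight",
]


def group_by_submodule(keys, depth=3):
    """Group parameter names by their first `depth` dotted components.

    Hypothetical helper for illustration only.
    """
    groups = defaultdict(list)
    for key in keys:
        parts = key.split(".")
        prefix = ".".join(parts[:depth])
        groups[prefix].append(".".join(parts[depth:]))
    # Sort the parameter names so the output is stable.
    return {prefix: sorted(names) for prefix, names in groups.items()}


groups = group_by_submodule(missing_keys)
for prefix, names in sorted(groups.items()):
    print(prefix, names)
```

Every missing key falls under either aspect_ratio_embedder or resolution_embedder, which suggests the rest of the checkpoint matches the model definition.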
{
"_class_name": "Transformer2DModel",
"_diffusers_version": "0.28.0.dev0",
"activation_fn": "gelu-approximate",
"attention_bias": true,
"attention_head_dim": 72,
"attention_type": "default",
"caption_channels": 4096,
"cross_attention_dim": 1152,
"double_self_attention": false,
"dropout": 0.0,
"in_channels": 4,
"interpolation_scale": 1,
"norm_elementwise_affine": false,
"norm_eps": 1e-06,
"norm_num_groups": 32,
"norm_type": "ada_norm_single",
"num_attention_heads": 16,
"num_embeds_ada_norm": 1000,
"num_layers": 28,
"num_vector_embeds": null,
"only_cross_attention": false,
"out_channels": 8,
"patch_size": 2,
"sample_size": 64,
"upcast_attention": false,
"use_linear_projection": false,
"use_additional_conditions": true # add this line
}
Thank you for replying. I know "use_additional_conditions": true isn't in the original config.json, but without use_additional_conditions, how does the model run inference at arbitrary resolutions?
These model weights (PixArt-alpha/PixArt-Sigma-XL-2-512-MS and PixArt-alpha/PixArt-Sigma-XL-2-1024-MS) do not have resolution_embedder or aspect_ratio_embedder. Am I missing something? Do you have a tutorial that supports generation at any aspect ratio/resolution? @lawrence-cj
Change this line to "use_additional_conditions": false. Multi-scale images are generated according to the shape of the noise.
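A rough sketch of what "according to the shape of the noise" could mean in practice: the requested image size determines the latent noise shape, and the transformer sees more or fewer patch tokens. in_channels=4 and patch_size=2 come from the config above; the VAE downscale factor of 8 is an assumption (it agrees with sample_size 64 for the 512 model), and both function names are hypothetical.

```python
def latent_noise_shape(height, width, batch_size=1,
                       latent_channels=4, vae_scale_factor=8):
    """Shape of the initial latent noise for a requested image size.

    latent_channels=4 matches "in_channels" in the config above.
    vae_scale_factor=8 is an assumption, not read from the checkpoint.
    """
    if height % vae_scale_factor or width % vae_scale_factor:
        raise ValueError("height and width must be divisible by vae_scale_factor")
    return (batch_size, latent_channels,
            height // vae_scale_factor, width // vae_scale_factor)


def num_patch_tokens(height, width, patch_size=2, vae_scale_factor=8):
    """Number of patch tokens the transformer processes for this image size."""
    _, _, latent_h, latent_w = latent_noise_shape(
        height, width, vae_scale_factor=vae_scale_factor
    )
    return (latent_h // patch_size) * (latent_w // patch_size)


# Square 512x512 versus a non-square image of height 512 and width 1024.
square = latent_noise_shape(512, 512)
wide = latent_noise_shape(512, 1024)
print(square, num_patch_tokens(512, 512))
print(wide, num_patch_tokens(512, 1024))
```

With diffusers, the size would normally be requested through the height and width arguments of the PixArtSigmaPipeline call rather than by building the noise by hand. As far as I know the pipeline also has a use_resolution_binning option that snaps the request to a nearby aspect-ratio bin, so check that if the output size is not what you asked for.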
But resolution_embedder and aspect_ratio_embedder are added during training. Don't they need to be present during inference too?
https://github.com/PixArt-alpha/PixArt-sigma/blob/master/diffusion/model/nets/PixArtMS.py#L188-L191 Are these lines not switched on during training?
Check it first.
But it is turned on in the config file: https://github.com/PixArt-alpha/PixArt-sigma/blob/6ec1500b079a85e291625e2f5a0c935fd9913f12/configs/pixart_alpha_config/PixArt_xl2_img1024_internalms.py#L32
This is PixArt-Alpha... not PixArt-Sigma
Sorry, I apologise for my stupidity.
Why is the weight named PixArt-Sigma-XL-2-512-MS? I guess the MS stands for multi-scale? Shouldn't it have multi-aspect/resolution embeddings? https://huggingface.co/PixArt-alpha/PixArt-Sigma-XL-2-512-MS/discussions/1
missing key: adaln_single.emb.resolution_embedder.linear_2.bias, adaln_single.emb.aspect_ratio_embedder.linear_1.bias, adaln_single.emb.resolution_embedder.linear_1.bias, adaln_single.emb.resolution_embedder.linear_1.weight, adaln_single.emb.aspect_ratio_embedder.linear_2.bias, adaln_single.emb.aspect_ratio_embedder.linear_1.weight, adaln_single.emb.resolution_embedder.linear_2.weight, adaln_single.emb.aspect_ratio_embedder.linear_2.weight.