open-mmlab / PowerPaint

[ECCV 2024] PowerPaint, a high-quality, versatile image inpainting model that supports text-guided object inpainting, object removal, image outpainting, and shape-guided object inpainting with only a single model.
https://powerpaint.github.io/
MIT License
364 stars, 18 forks

Does this work with SDXL? #30

Open tiimgreen opened 2 months ago

tiimgreen commented 2 months ago

I tried a quick example by changing the dimensions to 1024 and swapping the SD v1.5 models for the SDXL base models:

pipe = Pipeline.from_pretrained(
    "diffusers/stable-diffusion-xl-1.0-inpainting-0.1",
    torch_dtype=weight_dtype
)
pipe.tokenizer = TokenizerWrapper(
    from_pretrained="stabilityai/stable-diffusion-xl-base-1.0",
    subfolder="tokenizer",
    revision=None
)
add_tokens(
    tokenizer=pipe.tokenizer,
    text_encoder=pipe.text_encoder,
    placeholder_tokens=['P_ctxt', 'P_shape', 'P_obj'],
    initialize_tokens=['a', 'a', 'a'],
    num_vectors_per_token=10
)

But I get a bunch of errors like this:

size mismatch for up_blocks.2.resnets.2.time_emb_proj.bias: copying a param with shape torch.Size([640]) from checkpoint, the shape in current model is torch.Size([320]).
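
To see every mismatched parameter at once rather than reading the errors one by one, something like the following might help. It is only a rough sketch: `find_shape_mismatches` is a hypothetical helper (not part of PowerPaint or diffusers), and it assumes the parameter shapes have already been collected into plain dicts, for example from a PyTorch `state_dict()`. The sample values are copied from the error message above.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    checkpoint_shapes: Dict[str, Shape],
    model_shapes: Dict[str, Shape],
) -> List[Tuple[str, Shape, Shape]]:
    """Return (name, checkpoint_shape, model_shape) for each parameter
    that exists in both mappings but has a different shape."""
    mismatches = []
    # Only names present on both sides can produce a "size mismatch";
    # names missing on one side are a different kind of loading error.
    for name in sorted(checkpoint_shapes.keys() & model_shapes.keys()):
        ckpt_shape = checkpoint_shapes[name]
        model_shape = model_shapes[name]
        if ckpt_shape != model_shape:
            mismatches.append((name, ckpt_shape, model_shape))
    return mismatches


# Illustrative values taken from the error message above.
# With PyTorch, such dicts could be built (assumption) as:
#   {k: tuple(v.shape) for k, v in some_module.state_dict().items()}
checkpoint_shapes = {"up_blocks.2.resnets.2.time_emb_proj.bias": (640,)}
model_shapes = {"up_blocks.2.resnets.2.time_emb_proj.bias": (320,)}

for name, ckpt_shape, model_shape in find_shape_mismatches(
    checkpoint_shapes, model_shapes
):
    print(f"{name}: checkpoint {ckpt_shape} vs model {model_shape}")
```

A long list of mismatches in the UNet blocks would suggest that the checkpoint and the instantiated model use different architectures, rather than a single misconfigured layer.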