open-mmlab / PowerPaint

[ECCV 2024] PowerPaint, a versatile image inpainting model that supports text-guided object inpainting, object removal, image outpainting and shape-guided object inpainting with only a single model.
https://powerpaint.github.io/
MIT License
584 stars 38 forks

It seems that UNet and VAE are missing from the PowerPaint v2-1 model #50

Closed hjj-lmx closed 3 months ago

zengyh1900 commented 3 months ago

Hi @hjj-lmx,

We have checked it and updated our README with more detailed instructions. Please check it again. By the way, further discussion is always welcome :)

hjj-lmx commented 3 months ago


[Image: 1719557253(1)] Isn't the model missing here? I mean v2-1, not v2.

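This is something I wrote locally myself:

# Imports inferred from the names used below (assumed, not shown in the original post).
import os
import torch
from torch.hub import get_dir  # assumed source of get_dir, matching the torch\hub path in the log
from diffusers import UNet2DConditionModel, UniPCMultistepScheduler
from transformers import CLIPTextModel
from safetensors.torch import load_model
# BrushNetModel, StableDiffusionPowerPaintBrushNetPipeline, TokenizerWrapper and add_tokens
# come from the PowerPaint code base; their import paths depend on the local layout.

def create_brushNet_pipeline(self):
    hub_dir = get_dir()
    # Build paths from components instead of embedding backslashes in string literals.
    ckpt_dir = os.path.join(hub_dir, "checkpoints", "PowerPaint-v2-1")
    base_model_path = os.path.join(ckpt_dir, "realisticVisionV60B1_v51VAE")
    brushnet_dir = os.path.join(ckpt_dir, "PowerPaint_Brushnet")
    dtype = torch.float16

    # The SD 1.5 UNet is only used to initialise the BrushNet architecture.
    unet = UNet2DConditionModel.from_pretrained(
        "runwayml/stable-diffusion-v1-5", subfolder="unet", revision=None, torch_dtype=dtype
    )
    text_encoder_brushnet = CLIPTextModel.from_pretrained(
        "runwayml/stable-diffusion-v1-5", subfolder="text_encoder", torch_dtype=dtype
    )
    brushnet = BrushNetModel.from_unet(unet)

    pipe = StableDiffusionPowerPaintBrushNetPipeline.from_pretrained(
        base_model_path,
        brushnet=brushnet,
        text_encoder_brushnet=text_encoder_brushnet,
        torch_dtype=dtype,
        low_cpu_mem_usage=False,
        safety_checker=None,
    )
    # Load the base model's UNet and tokenizer from the local checkpoint folder.
    pipe.unet = UNet2DConditionModel.from_pretrained(
        base_model_path, subfolder="unet", revision=None, torch_dtype=dtype
    )
    pipe.tokenizer = TokenizerWrapper(from_pretrained=base_model_path, subfolder="tokenizer", revision=None)
    # Register the PowerPaint task tokens on the BrushNet text encoder.
    add_tokens(
        tokenizer=pipe.tokenizer,
        text_encoder=pipe.text_encoder_brushnet,
        placeholder_tokens=["P_ctxt", "P_shape", "P_obj"],
        initialize_tokens=["a", "a", "a"],
        num_vectors_per_token=10,
    )
    # Load the PowerPaint BrushNet weights and the matching text encoder weights.
    load_model(pipe.brushnet, os.path.join(brushnet_dir, "diffusion_pytorch_model.safetensors"))

    pipe.text_encoder_brushnet.load_state_dict(
        torch.load(os.path.join(brushnet_dir, "pytorch_model.bin")),
        strict=False
    )
    pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
    pipe.enable_model_cpu_offload()
    return pipe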

There is a warning when loading starts:

    Loading pipeline components...: 0%| | 0/6 [00:00<?, ?it/s]
    An error occurred while trying to fetch C:\Users\24708\.cache\torch\hub\checkpoints\PowerPaint-v2-1\realisticVisionV60B1_v51VAE\unet: Error no file named diffusion_pytorch_model.safetensors found in directory C:\Users\24708\.cache\torch\hub\checkpoints\PowerPaint-v2-1\realisticVisionV60B1_v51VAE\unet.
    Defaulting to unsafe serialization. Pass allow_pickle=False to raise an error instead.
    Loading pipeline components...: 83%|████████▎ | 5/6 [00:15<00:01, 1.95s/it]
    An error occurred while trying to fetch C:\Users\24708\.cache\torch\hub\checkpoints\PowerPaint-v2-1\realisticVisionV60B1_v51VAE\vae: Error no file named diffusion_pytorch_model.safetensors found in directory C:\Users\24708\.cache\torch\hub\checkpoints\PowerPaint-v2-1\realisticVisionV60B1_v51VAE\vae.
    Defaulting to unsafe serialization. Pass allow_pickle=False to raise an error instead.
    Loading pipeline components...: 100%|██████████| 6/6 [00:16<00:00, 2.70s/it]
    You have disabled the safety checker for <class 'ud_cleaner.power_paint.pipeline.pipeline_PowerPaint_Brushnet_CA.StableDiffusionPowerPaintBrushNetPipeline'> by passing safety_checker=None. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at https://github.com/huggingface/diffusers/pull/254 .
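The warning says that no diffusion_pytorch_model.safetensors file was found in the unet and vae subfolders, so the loader falls back to the pickle-based format. To see which weight files are actually on disk, a small check along these lines may help. This is only a sketch: the helper names describe_weights and check_base_model are made up for illustration, and the two file names are the conventional diffusers weight names, assumed to apply to this checkpoint layout.

```python
from pathlib import Path

# Conventional diffusers weight file names (assumed to apply to this checkpoint).
SAFETENSORS_NAME = "diffusion_pytorch_model.safetensors"
PICKLE_NAME = "diffusion_pytorch_model.bin"


def describe_weights(component_dir):
    """Report which weight format is present in one component folder."""
    folder = Path(component_dir)
    if (folder / SAFETENSORS_NAME).is_file():
        return "safetensors"
    if (folder / PICKLE_NAME).is_file():
        return "bin"
    return "missing"


def check_base_model(base_model_path, components=("unet", "vae")):
    """Map each component subfolder to the weight format found in it."""
    root = Path(base_model_path)
    return {name: describe_weights(root / name) for name in components}
```

Calling check_base_model(base_model_path) on the realisticVisionV60B1_v51VAE folder would return "bin" for a component that only ships a .bin file (the case in which the warning is harmless) and "missing" if no weights were downloaded at all.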

hjj-lmx commented 3 months ago


I also ran into another problem: when using image outpainting, the original part of the image seems to be changed in the generated result. Is there a way to solve this?
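A general workaround for this kind of drift in latent-diffusion inpainting (not something the PowerPaint authors confirm in this thread) is to composite the original pixels back over the output using the mask, since the whole image is decoded from latents and pixels outside the mask can shift slightly. A minimal sketch with NumPy follows; the function name paste_back_original is hypothetical, and it assumes the original, the generated image and the mask are already aligned and the same size.

```python
import numpy as np


def paste_back_original(original, generated, mask):
    """Keep original pixels where mask is 0 and generated pixels where mask is 1.

    original, generated: uint8 arrays of shape (H, W, 3).
    mask: array of shape (H, W) with values in [0, 1]; 1 marks the area to fill.
    """
    # Add a channel axis so the mask broadcasts over RGB.
    alpha = mask.astype(np.float32)[..., None]
    blended = alpha * generated.astype(np.float32) + (1.0 - alpha) * original.astype(np.float32)
    return np.clip(np.rint(blended), 0, 255).astype(np.uint8)
```

A slightly blurred mask instead of a hard 0/1 mask would soften the seam between the kept and the generated region, at the cost of altering a thin band of original pixels.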
