soumik-kanad opened 1 year ago
This is what I use for inference:

```python
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline
from lora_diffusion import patch_pipe, tune_lora_scale

device = "cuda"
model_path = "runwayml/stable-diffusion-inpainting"
pipe = StableDiffusionInpaintPipeline.from_pretrained(
    model_path,
    torch_dtype=torch.float16,
).to(device)

lora_scale = 0.5
prompt = "photo of <s1><s2>"
lora_model_path = "outputs/final_lora.safetensors"
patch_pipe(
    pipe,
    lora_model_path,
    patch_text=True,
    patch_ti=True,
    patch_unet=True,
)
torch.manual_seed(0)
tune_lora_scale(pipe.unet, lora_scale)
tune_lora_scale(pipe.text_encoder, lora_scale)

image = Image.open("image2.jpg").convert("RGB").resize((512, 512))
mask_image = Image.open("mask2.png").convert("RGB").resize((512, 512))

# The safety checker kept flagging outputs as NSFW and returning black
# images, so replace it with a no-op that passes everything through.
def dummy(images, **kwargs):
    return images, [False] * len(images)

pipe.safety_checker = dummy

image = pipe(
    prompt=prompt,
    image=image,
    mask_image=mask_image,
    num_inference_steps=50,
    guidance_scale=7,
).images[0]
display(image)
```
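As an aside, rather than monkey-patching a dummy checker, diffusers also lets you skip loading the safety checker entirely. A minimal sketch of that alternative (it logs a warning but works, at least on recent diffusers versions):

```python
# Load the pipeline without a safety checker instead of overriding it later.
pipe = StableDiffusionInpaintPipeline.from_pretrained(
    model_path,
    torch_dtype=torch.float16,
    safety_checker=None,
).to(device)
```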
Similar result here. Have you solved the problem?
Nope. I'm not sure how to solve this.
Can someone please help me figure out why inpainting is not working for me while basic image generation seems to be working?
I have a folder with 500 images of an identity sampled from a few videos.
I tried training the basic LoRA model with the default flags, and it works:

lora_scale=0.5, prompt="style of <s1><s2>"

But when I trained an inpainting model on the same dataset with the default inpainting flags, it gives garbage. (I made one small change: I used `--use_template="object"` so that `--placeholder_token_at_data="<krk>|<s1><s2>"` does not get rid of the custom tokens and the object text templates are used.)

Inputs:
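One thing worth checking when the inpainting run gives garbage is whether the saved checkpoint actually contains everything `patch_pipe` expects: the UNet LoRA, the text-encoder LoRA, and the `<s1>`/`<s2>` textual-inversion embeddings. A minimal inspection sketch (the exact key and metadata names depend on your `lora_diffusion` version):

```python
from safetensors import safe_open

# List what was actually saved in the trained checkpoint.
with safe_open("outputs/final_lora.safetensors", framework="pt", device="cpu") as f:
    print("metadata:", f.metadata())
    for key in f.keys():
        print(key, tuple(f.get_tensor(key).shape))
```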
Without LoRA patching, the output looks fine:

prompt="photo of <s2>"

But as soon as I patch the model, it gives garbage outputs:

lora_scale=0.0, prompt="photo of <s1>"
lora_scale=0.0, prompt="photo of <s2>"
lora_scale=0.5, prompt="photo of <s1><s2>"
lora_scale=0.5, prompt="photo of <s1>"
lora_scale=0.5, prompt="photo of <s2>"
I also tried varying lora_scale, but that doesn't help (as the 0.0 to 0.5 results above show). Different prompts didn't help either.
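For anyone debugging the same thing: as far as I can tell, `tune_lora_scale` only scales the injected LoRA layers, while `patch_ti=True` overwrites the `<s1>`/`<s2>` token embeddings regardless of `lora_scale`. So garbage even at `lora_scale=0.0` points at the textual-inversion embeddings (or the text-encoder patch) rather than the UNet LoRA. A sketch for isolating the two, reusing the `patch_pipe` call from above on a freshly loaded pipeline:

```python
# Patch only the UNet LoRA; leave the text encoder and token embeddings alone.
patch_pipe(
    pipe,
    lora_model_path,
    patch_text=False,
    patch_ti=False,
    patch_unet=True,
)
tune_lora_scale(pipe.unet, lora_scale)
```

If the UNet-only patch produces reasonable images, the learned embeddings or text-encoder LoRA from the inpainting run are the likely culprits.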