The bold prompt and ng_prompt lines are actually code that had been commented out: a # was added in front of those lines, and GitHub renders them in bold.
I would appreciate your suggestions for scenarios like this: background replacement, background generation, and background extension.
@mechigonft It would be better to ask this question in the Discussions section.
@DN6 OK, thx
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
closing this! feel free to open a question in discussion section
Describe the bug
I'm using the instruct_pix2pix training method to regenerate backgrounds for cut-out food images. However, I've noticed that the generated backgrounds often contain numerous fragmented and distorted cups, plates, and bowls. What could be the reason for this? I've examined my training data, and although it also includes cups, plates, and bowls, there is only one of each, and all are in their normal shape. Could you help me look into this issue?

cut-out food image: [image]

after regenerating the background: [image]

my training data example:

input_image: [image]

edited_image: [image]
Reproduction
training script:
```bash
export MODEL_NAME="/models/stable-diffusion-v1-5"
export DATASET_ID=""
export OUTPUT_DIR=""

accelerate launch --mixed_precision="fp16" /ossfs/workspace/diffusers/examples/instruct_pix2pix/train_instruct_pix2pix.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --dataset_name=$DATASET_ID \
  --enable_xformers_memory_efficient_attention \
  --resolution=256 --random_flip \
  --train_batch_size=1 --gradient_accumulation_steps=1 --gradient_checkpointing \
  --max_train_steps=5000 \
  --checkpointing_steps=10000 --checkpoints_total_limit=1 \
  --learning_rate=5e-05 --max_grad_norm=1 --lr_warmup_steps=0 \
  --conditioning_dropout_prob=0.05 \
  --mixed_precision=fp16 \
  --seed=42 \
  --output_dir=$OUTPUT_DIR
```
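For context, a minimal sketch of how the input_image / edited_image pairs mentioned above can be organized into the dataset passed via `--dataset_name`. The column names follow what I believe are the script's defaults (`--original_image_column`, `--edited_image_column`, `--edit_prompt_column`); the file paths, prompt text, and repo name are placeholders:

```python
# Minimal sketch of assembling paired training data for train_instruct_pix2pix.py.
# Paths, prompt text, and the Hub repo name below are placeholders.
from datasets import Dataset, Image

examples = {
    "input_image": ["pairs/0001_input.png"],    # cut-out food image
    "edited_image": ["pairs/0001_edited.png"],  # same dish with the regenerated background
    "edit_prompt": ["replace the background with a clean, simple background"],
}

dataset = Dataset.from_dict(examples)
dataset = dataset.cast_column("input_image", Image()).cast_column("edited_image", Image())
# dataset.push_to_hub("your-username/food-background-pairs")  # then set $DATASET_ID to this repo id
```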
inference script:
```python
import PIL
import requests
import torch
from diffusers import StableDiffusionInstructPix2PixPipeline

model_id = ''  # <- replace this
pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")
generator = torch.Generator("cuda").manual_seed(0)

image_path = '/ossfs/workspace/result.png'

def download_image(image_path):
    image = PIL.Image.open(image_path)
    image = PIL.ImageOps.exif_transpose(image)
    image = image.convert("RGB")
    return image

image = download_image(image_path)

# prompt = 'replace the background with a clean and concise background, simple and clean'
# prompt = 'replace the background picture to pure white background'
prompt = 'extend background'
num_inference_steps = 20
image_guidance_scale = 1.5
guidance_scale = 10

edited_image = pipe(
    prompt,
    # ng_prompt='other food and drinks, white empty cups, white empty bowls, white empty plates, cutlery, knives and forks, chopsticks, complex background',
    ng_prompt='cups, bowls, plates',
    image=image,
    num_inference_steps=num_inference_steps,
    image_guidance_scale=image_guidance_scale,
    guidance_scale=guidance_scale,
    generator=generator,
).images[0]
edited_image.save("/ossfs/workspace/result_extend_background.png")
```
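Side note: `StableDiffusionInstructPix2PixPipeline` takes the negative prompt through the `negative_prompt` argument, so the `ng_prompt` keyword above is likely ignored or rejected. A minimal variant of the same call using the standard argument name, with everything else unchanged:

```python
# Same call as above, but passing the negative prompt through the
# `negative_prompt` argument that the diffusers pipeline accepts.
edited_image = pipe(
    prompt,
    image=image,
    negative_prompt='cups, bowls, plates',
    num_inference_steps=num_inference_steps,
    image_guidance_scale=image_guidance_scale,
    guidance_scale=guidance_scale,
    generator=generator,
).images[0]
```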
Logs
No response
System Info
```
$ diffusers-cli env
Setting ds_accelerator to cuda (auto detect)

Copy-and-paste the text below in your GitHub issue and FILL OUT the two last points.

- `diffusers` version: 0.28.0.dev0
```

Who can help?
No response