TonyLianLong / LLM-groundedDiffusion

LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models (LMD)
https://llm-grounded-diffusion.github.io/

How to Swap Object Positions While Maintaining Consistent Background in Image Synthesis? #15


rabiulcste commented 5 months ago

I'm working on image synthesis with a focus on vision-language fine-grained understanding. I'm facing a challenge in generating two images that maintain a consistent background but swap the positions of two objects (e.g., a dog on the left and a cat on the right in the first image, and vice versa in the second image).

I've tried fixing the seed and the bounding box locations and only swapping the object names, but it doesn't seem to work. Any guidance would be greatly appreciated.
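For reference, here is a minimal sketch of what I'm doing, assuming a layout spec with a list of (object name, [x, y, w, h]) boxes and a shared background prompt; `generate_image` is a hypothetical placeholder for the actual generation entry point, not this repo's exact API.

```python
# Minimal sketch (illustrative only): keep the seed, background prompt, and
# box coordinates fixed, and swap only which object name goes with each box.
# `generate_image` is a hypothetical placeholder, not this repo's actual API.
import copy
import torch

layout_a = {
    "prompt": "a realistic photo of a dog and a cat in a park",
    "bg_prompt": "a realistic photo of a park",
    "gen_boxes": [
        ("a dog", [50, 220, 180, 160]),   # left box
        ("a cat", [310, 230, 150, 140]),  # right box
    ],
}

# Second layout: same boxes, same background, only the names are swapped.
layout_b = copy.deepcopy(layout_a)
(name0, box0), (name1, box1) = layout_b["gen_boxes"]
layout_b["gen_boxes"] = [(name1, box0), (name0, box1)]

seed = 42
for layout in (layout_a, layout_b):
    torch.manual_seed(seed)  # identical seed for both images
    # image = generate_image(layout, seed=seed)  # hypothetical call
```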

TonyLianLong commented 5 months ago

You can visualize the first stage of generation (i.e., the per-box generation of each object) to check whether the objects' appearance stays consistent across the two layouts. If it does, increasing the number of frozen steps helps preserve the appearance in the composed image.
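To make the frozen-step suggestion concrete, here is a rough sketch of how the setting maps onto denoising steps; the `frozen_step_ratio` name and the 50-step count are illustrative assumptions, not exact values from the codebase.

```python
# Illustrative arithmetic only: with 50 denoising steps and a frozen-step
# ratio of 0.6, the per-box (stage-1) latents are kept for the first 30 steps
# of the composed generation, which is what preserves each object's
# appearance when the two layouts are swapped.
num_inference_steps = 50   # assumed step count
frozen_step_ratio = 0.6    # assumed knob; raise it if appearance drifts
num_frozen_steps = int(num_inference_steps * frozen_step_ratio)
print(f"Freezing per-box latents for the first {num_frozen_steps} "
      f"of {num_inference_steps} steps.")
```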