Strike1999 closed this issue 5 months ago
Thanks for your inspiring work!
However, I ran into a problem. When I use the model trained on COCO-Stuff at an image size of 512×512, the generation quality is poor.
The layout prompt, taken from COCO-Stuff, is:
layout = {
    "bbox": [
        ['metal', 0.04218750074505806, 0.25647059082984924, 0.10000000149011612, 0.5247058868408203],
        ['chair', 0.17940625548362732, 0.4312705993652344, 0.35014063119888306, 0.5062353014945984],
        ['sky-other', 0.606249988079071, 0.0, 0.734375, 0.09882353246212006],
        ['person', 0.0, 0.5493882298469543, 0.07332812249660492, 0.7298117876052856],
        ['pavement', 0.0, 0.5976470708847046, 0.9781249761581421, 1.0],
        ['building-other', 0.0, 0.0, 1.0, 0.7152941226959229],
        ['person', 0.8331093788146973, 0.5236706137657166, 0.913937509059906, 0.8113176226615906],
        ['chair', 0.422062486410141, 0.4221176505088806, 0.6030937433242798, 0.499505877494812],
        ['bus', 0.1626562476158142, 0.29044705629348755, 0.8476094007492065, 0.9376470446586609],
        ['person', 0.32343751192092896, 0.3623529374599457, 0.792187511920929, 0.5176470875740051],
        ['person', 0.9270156025886536, 0.49814116954803467, 0.9953437447547913, 0.8023764491081238],
        ['clothes', 0.15000000596046448, 0.567058801651001, 1.0, 1.0]
    ]
}
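For anyone reproducing this, it can help to sanity-check the layout before running generation. The snippet below is only a rough sketch: it assumes each entry is [label, x0, y0, x1, y1] with coordinates normalized to [0, 1] (which the numbers above are consistent with, but the thread does not state), and `to_pixel_boxes` is a hypothetical helper, not something from the repository.

```python
from typing import List, Sequence, Tuple

PixelBox = Tuple[str, int, int, int, int]


def to_pixel_boxes(bboxes: Sequence[Sequence], width: int, height: int) -> List[PixelBox]:
    """Validate normalized boxes and convert them to pixel coordinates.

    Assumption (not confirmed by the thread): each entry is
    [label, x0, y0, x1, y1] with all coordinates in [0, 1].
    """
    pixel_boxes = []
    for label, x0, y0, x1, y1 in bboxes:
        if not all(0.0 <= v <= 1.0 for v in (x0, y0, x1, y1)):
            raise ValueError(f"{label}: coordinates must lie in [0, 1]")
        if x1 <= x0 or y1 <= y0:
            raise ValueError(f"{label}: box has non-positive width or height")
        pixel_boxes.append(
            (label, round(x0 * width), round(y0 * height),
             round(x1 * width), round(y1 * height))
        )
    return pixel_boxes


# Two boxes copied from the layout above, checked at 512x512.
layout = {
    "bbox": [
        ["bus", 0.1626562476158142, 0.29044705629348755,
         0.8476094007492065, 0.9376470446586609],
        ["pavement", 0.0, 0.5976470708847046, 0.9781249761581421, 1.0],
    ]
}
pixel_boxes = to_pixel_boxes(layout["bbox"], width=512, height=512)
for box in pixel_boxes:
    print(box)
```

If a box fails the check, or the pixel coordinates look swapped relative to the source image, the layout itself is a likely place to look before suspecting the model.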
The generation config is:
{
    "dataset": "coco_stuff",
    "num_bucket_per_side": [256, 256],
    "width": 512,
    "height": 512,
    "prompt_template": "An image with {bbox}",
    "cfg_scale": 4.5,
    "num_inference_steps": 50,
    "max_num_bbox": 18
}
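As a side note on how "num_bucket_per_side" relates to "width" and "height": the sketch below shows one common way to discretize a normalized coordinate into a fixed number of buckets. This is an assumption about what the setting means, not a description of the repository's code, and `quantize` is a hypothetical helper. Under that reading, 256 buckets over 512 pixels gives a grid of 2 pixels per bucket.

```python
def quantize(coord: float, num_buckets: int) -> int:
    """Map a normalized coordinate in [0, 1] to a bucket index.

    Assumed scheme (hypothetical): uniform buckets, with 1.0 clamped
    into the last bucket so the index stays in [0, num_buckets - 1].
    """
    if not 0.0 <= coord <= 1.0:
        raise ValueError("coord must lie in [0, 1]")
    return min(int(coord * num_buckets), num_buckets - 1)


# Values taken from the config above.
config = {"num_bucket_per_side": [256, 256], "width": 512, "height": 512}
buckets_x, buckets_y = config["num_bucket_per_side"]

# Pixels covered by one bucket along each side.
pixels_per_bucket_x = config["width"] / buckets_x
pixels_per_bucket_y = config["height"] / buckets_y

print(quantize(0.0, buckets_x), quantize(0.5, buckets_x), quantize(1.0, buckets_x))
print(pixels_per_bucket_x, pixels_per_bucket_y)
```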
However, the result generated with run_layout_to_image.py looks strange.
I've tried different prompts, and the results are all similarly confusing.
What am I doing wrong? Thanks!
Thank you for your feedback. The issue has been resolved via email.