The --size-cond parameter adjusts the domain in which the generated image lies. By tuning it, you can make your generated images lean towards the training images of the corresponding size. For the underlying principle, see Stable Diffusion XL.
Yes, increasing the resolution of the generated images will result in a longer token sequence, which significantly increases the inference time.
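To give a rough feel for how quickly the sequence grows, here is a minimal sketch. It assumes a latent-diffusion transformer with an 8x VAE downsampling factor and 2x2 patches; both values, and the function names, are illustrative assumptions rather than something confirmed in this thread. It also only models the quadratic self-attention term, so real inference time will not scale exactly like this.

```python
def token_count(height, width, vae_factor=8, patch_size=2):
    """Image tokens for a given pixel resolution.

    vae_factor and patch_size are assumed values, not taken from the model's code.
    """
    stride = vae_factor * patch_size  # pixels covered by one token along each axis
    if height % stride or width % stride:
        raise ValueError(f"height and width must be multiples of {stride}")
    return (height // stride) * (width // stride)


def relative_attention_cost(height, width, base=(1024, 1024)):
    """Self-attention cost relative to a base resolution (quadratic in token count)."""
    n = token_count(height, width)
    n_base = token_count(*base)
    return (n / n_base) ** 2


if __name__ == "__main__":
    for h, w in [(1024, 1024), (1280, 768), (2048, 2048)]:
        print(h, w, token_count(h, w), relative_attention_cost(h, w))
```

Under these assumptions, doubling both sides of the image quadruples the token count, and the attention term alone grows by about 16x.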
Generating images at different resolutions and experimenting with the --size-cond parameter can lead to varied outcomes: you might achieve better semantics or higher image quality. Please feel free to experiment.