Tencent / HunyuanDiT

Hunyuan-DiT : A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
https://dit.hunyuan.tencent.com/

Can it generate 1080p images? #68

Closed mrlihellohorld closed 4 months ago

mrlihellohorld commented 4 months ago
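  1. What does the --size-cond parameter mean?
  2. I want to generate images at [1080, 1920] resolution. Do I only need to change the --image-size parameter? In my tests, after changing it, 100 steps take 10 minutes (on a 3090, generating four images with batch size 4).
  3. When generating at other resolutions, does the --size-cond parameter need to be changed accordingly?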
zml-ai commented 4 months ago
  1. The --size-cond parameter controls which part of the training distribution the generated image is drawn toward. By adjusting it, you can make your generated images resemble the training images of the corresponding size. For the underlying principle, see the size conditioning used in Stable Diffusion XL.

  2. Yes, increasing the resolution of the generated images will result in a longer token sequence, which significantly increases the inference time.

  3. Generating images of different resolutions and experimenting with adjusting the --size-cond parameter could lead to varied outcomes. You might achieve better semantics or higher image quality. Please feel free to experiment.
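
To make point 2 concrete, here is a rough sketch of how the token count grows with resolution. It is not taken from the HunyuanDiT code: the 8x VAE downsampling factor, the patch size of 2, and rounding up for sizes that do not divide evenly are all assumptions used for illustration, and `token_count` is a hypothetical helper.

```python
import math


def token_count(height: int, width: int, vae_factor: int = 8, patch_size: int = 2) -> int:
    """Estimate the number of image tokens the transformer processes.

    Assumption (not confirmed in this thread): the image is downsampled
    by `vae_factor` in the VAE and then split into `patch_size` x `patch_size`
    latent patches, with partial patches rounded up.
    """
    stride = vae_factor * patch_size  # pixels covered by one token per side
    rows = math.ceil(height / stride)
    cols = math.ceil(width / stride)
    return rows * cols


base = token_count(1024, 1024)  # reference resolution
hd = token_count(1080, 1920)    # resolution asked about in this issue

ratio = hd / base
print(f"tokens at 1024x1024: {base}")
print(f"tokens at 1080x1920: {hd}")
print(f"sequence length ratio: {ratio:.2f}")
# Self-attention cost grows roughly with the square of the sequence length.
print(f"approximate attention cost ratio: {ratio ** 2:.2f}")
```

Under these assumptions, 1080x1920 roughly doubles the sequence length relative to 1024x1024, and the attention cost grows by about four times, which is consistent with the much longer inference time reported above. Note that 1080 is not a multiple of 16, so the real pipeline may require or prefer a nearby size; that is also an assumption to check against the repository.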