YanzuoLu / CFLD

[CVPR 2024 Highlight] Coarse-to-Fine Latent Diffusion for Pose-Guided Person Image Synthesis
MIT License
183 stars, 12 forks

Inference image resolution 256x176 #18

Closed: triton99 closed this issue 6 months ago

triton99 commented 6 months ago

Hi @YanzuoLu, thanks for sharing this great work!

I want to run inference at an image resolution of 256x176. Could you share the instructions and the config file for that model? In playground.ipynb, the latent output size is 64x64, and the output of the VAE decoder is currently 512x512.

Thank you.

YanzuoLu commented 6 months ago

Because we build on the pre-trained SD15, there is only one model, and it always generates 512x512 images. Evaluation at 256x176 is done by resizing the generated outputs. Hope this helps.
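For readers who want to try the resizing step, here is a minimal sketch using Pillow. It is not taken from the CFLD code: the helper name `resize_for_eval`, the bicubic filter, and the reading of 256x176 as height x width are all assumptions, so check the repository's evaluation script for the settings actually used.

```python
from PIL import Image

# Assumption: "256x176" means height 256, width 176.
TARGET_HEIGHT, TARGET_WIDTH = 256, 176


def resize_for_eval(img, height=TARGET_HEIGHT, width=TARGET_WIDTH):
    """Resize a generated image to the evaluation resolution.

    Hypothetical helper. Bicubic interpolation is an assumption;
    the thread does not say which filter the authors use.
    """
    # Pillow expects the target size as (width, height).
    return img.resize((width, height), resample=Image.BICUBIC)


# Stand-in for a 512x512 output decoded from a 64x64 latent.
generated = Image.new("RGB", (512, 512))
small = resize_for_eval(generated)
print(small.size)  # (176, 256), i.e. (width, height)
```

Note that a square 512x512 image resized directly to 256x176 changes the aspect ratio, so the reference images should go through the same resizing for the comparison to be consistent.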