Question about frame dimension: 16 for inference but 14 for training?

fudan-generative-vision / hallo

Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation

MIT License

Open Nyquist0 opened 3 weeks ago

Hi,

I find the frame dimension you are using confusing.

inference.py uses 16, but training uses 14.

May I ask which value is better? I assume a larger value gives better results?