How to train SyncNet with frames of 3 and 7?

joshmoody24 / sitcom-simulator

A tool that combines ChatGPT, Stable Diffusion, FakeYou, and FreePD to create AI-generated videos.

MIT License

How to train SyncNet with frames of 3 and 7? #9

Open wtc9806 opened 7 months ago

wtc9806 commented 7 months ago

Hello, I want to train SyncNet with image sequences of 3 and 7 frames, but I don't know if my configuration is correct. With 5 image frames, syncnet_T is 5 and syncnet_mel_step_size is 16, so one image frame corresponds to 3.2 audio frames (16 / 5). Does that mean that for 3 and 7 input frames the corresponding syncnet_mel_step_size should be 9.6 (3 × 3.2) and 22.4 (7 × 3.2)?
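The proportional arithmetic in the question can be checked with a short sketch. It uses only the two values quoted above (5 image frames, 16 audio steps). The helper name `mel_steps_for` is hypothetical and not part of any library. The sketch also prints the neighbouring integers, on the assumption that the training code needs a whole number of steps, which this thread does not confirm.

```python
from fractions import Fraction
import math

# Values quoted in the question for the 5-frame case.
BASE_FRAMES = 5       # syncnet_T
BASE_MEL_STEPS = 16   # syncnet_mel_step_size


def mel_steps_for(num_frames: int) -> Fraction:
    """Exact number of audio frames spanned by num_frames image frames.

    Hypothetical helper: scales the 16 / 5 ratio from the question linearly.
    """
    return Fraction(BASE_MEL_STEPS, BASE_FRAMES) * num_frames


if __name__ == "__main__":
    for t in (3, 5, 7):
        exact = mel_steps_for(t)
        # Show the exact value and the nearest whole-number candidates.
        print(f"T={t}: exact={float(exact)}, "
              f"floor={math.floor(exact)}, ceil={math.ceil(exact)}")
```

For 3 and 7 frames the exact values are fractional, so if the step size has to be an integer, one of the neighbouring values would have to be chosen instead. Which one is appropriate is not something this sketch can decide.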

joshmoody24 commented 6 months ago

Hi, unfortunately I don't know anything about SyncNet or how it's related to Sitcom Simulator. Could you elaborate?