luosiallen / Diff-Foley

Diff-Foley: Synchronized Video-to-Audio Synthesis with Latent Diffusion Models
Apache License 2.0

about evaluation #5

Closed: Yusiissy closed this issue 8 months ago

Yusiissy commented 10 months ago

Hi, thanks for your great open-source work! I noticed that your model generates audio at 16,000 Hz, while SpecVQGAN generates audio at 22,050 Hz. How did you process the data when computing the evaluation metrics? As far as I know, SpecVQGAN's metrics are computed on audio that is all at 22,050 Hz. Could you provide the evaluation code? I am looking forward to your reply!

luosiallen commented 8 months ago

Please refer to evaluation/transform_spec.py
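
For readers comparing outputs at the two sample rates mentioned above, here is a minimal sketch of one common way to bring audio to a shared rate before computing metrics. It is not taken from evaluation/transform_spec.py, and it is an assumption that resampling is the relevant step at all; the repository script may handle the mismatch differently (for example at the spectrogram level). The helper name `resample_audio` and the synthetic test tone are hypothetical, and the choice of target rate (16,000 Hz or 22,050 Hz) is left to the reader.

```python
from math import gcd

import numpy as np
from scipy.signal import resample_poly


def resample_audio(wave: np.ndarray, src_sr: int, dst_sr: int) -> np.ndarray:
    """Resample a 1-D waveform from src_sr to dst_sr with a polyphase filter.

    Hypothetical helper for illustration; not part of the Diff-Foley code.
    """
    if src_sr == dst_sr:
        return wave
    # Reduce the rate ratio to lowest terms, e.g. 22050/16000 -> 441/320.
    g = gcd(src_sr, dst_sr)
    up, down = dst_sr // g, src_sr // g
    return resample_poly(wave, up, down)


# Hypothetical input: one second of a 440 Hz tone sampled at 16 kHz,
# standing in for a generated clip.
t = np.arange(16000) / 16000
generated = np.sin(2 * np.pi * 440 * t)

# Bring the 16 kHz clip to 22.05 kHz so both systems share one rate.
resampled = resample_audio(generated, 16000, 22050)
print(resampled.shape)
```

Whichever direction is chosen, applying the same target rate to every system being compared keeps the metric inputs consistent.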