v-iashin / SpecVQGAN

Source code for "Taming Visually Guided Sound Generation" (Oral at the BMVC 2021)
https://v-iashin.github.io/SpecVQGAN
MIT License
347 stars 40 forks

Evaluation Results #35

Open aselimc opened 1 year ago

aselimc commented 1 year ago

Hello,

I've been trying to reproduce the results on the VAS dataset using the pretrained models provided in this repository (i.e. ResNet-50-5 features with 20.9 FID). However, the results I've obtained are not even close to what's been reported (I got around 28.0). Is there any specific reason for this discrepancy?