ariesssxu / vta-ldm

Apache License 2.0

Question about the evaluation results #6

Closed BingliangLi closed 4 weeks ago

BingliangLi commented 1 month ago

Hi, thanks for open-sourcing your code! I have a question about the evaluation results. In your paper, Diff-Foley gets an Inception Score (IS) of only 9.62, which is far from the results reported in their paper. Did you use the evaluation code from SpecVQGAN? In my own test, their model can reach an IS of up to 45. Thanks, and good luck with your paper!

ariesssxu commented 1 month ago

Hi, the evaluation code we use is mainly adapted from audioldm's evaluation code (this repo). Notably, the Inception Score (IS) we obtained is in line with the results reported in both audioldm and tango, where the SOTA IS is around 10. I'm not yet sure what the specific differences between the two evaluation codebases are, but I will investigate this further.
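
For reference, here is a minimal sketch of how IS is computed from a classifier's per-sample class probabilities, IS = exp(E_x[KL(p(y|x) || p(y))]). It is not the code from either evaluation repo; the function name `inception_score` and the `eps` smoothing term are assumptions made for illustration. One thing it makes visible is that IS can never exceed the number of classes the classifier predicts, so if the two evaluation codebases score audio with different classifiers (an assumption, not verified here), their IS values would not be directly comparable.

```python
import math


def inception_score(probs, eps=1e-12):
    """Compute IS from a list of per-sample class-probability lists.

    Hypothetical helper for illustration: `probs[i][j]` is p(y=j | x_i),
    and each inner list is assumed to sum to 1.
    """
    n = len(probs)
    k = len(probs[0])
    # Marginal class distribution p(y), averaged over all samples.
    marginal = [sum(p[j] for p in probs) / n for j in range(k)]
    # Sum of KL(p(y|x) || p(y)) over samples; eps guards against log(0).
    kl_sum = 0.0
    for p in probs:
        kl_sum += sum(
            pj * (math.log(pj + eps) - math.log(marginal[j] + eps))
            for j, pj in enumerate(p)
        )
    # IS is the exponential of the mean KL divergence.
    return math.exp(kl_sum / n)


# Confident and diverse predictions over 4 classes: IS approaches 4,
# the upper bound for a 4-class classifier.
one_hot = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
best_case = inception_score(one_hot)

# Identical predictions for every sample: IS is 1, the lower bound.
same = [[0.25, 0.25, 0.25, 0.25] for _ in range(4)]
worst_case = inception_score(same)
```

The same set of generated samples can therefore land on very different scales depending on which classifier produces `probs` and how many classes it has.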