guyyariv / TempoTokens

This repo contains the official PyTorch implementation of: Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptation
https://pages.cs.huji.ac.il/adiyoss-lab/TempoTokens/
MIT License

Can I use AV-Align to assess video-to-audio generation? #8

Closed: BingliangLi closed this issue 1 month ago

BingliangLi commented 1 month ago

Hi, I think the AV-Align score is a brilliant idea. Does it make sense to use it to assess V2A (video-to-audio) generation instead of A2V (audio-to-video)?

guyyariv commented 1 month ago

Hey, thank you! That makes sense to me, and a recent study has already addressed this (see https://arxiv.org/abs/2407.07464).

BingliangLi commented 1 month ago

Thanks!