Closed LeeYongHyeok closed 5 years ago
Hello, @georgesterpu. Thanks for the code release. I made your av_align model the basis of my research.
In your paper, cross-modal alignment from video to audio works well, but alignment from audio to video may not. So I would like to see how the model behaves when both directions, audio-to-video and video-to-audio, are used simultaneously.
I checked the structure of the AttentiveEncoder class in avsr/encoder.py and found that it takes the plain video encoder's outputs together with the audio features, so it builds only the video-to-audio direction. I would like to have video-to-audio and audio-to-video attention at the same time, but this does not seem possible with the current code structure.
Which part do I need to modify so that your AttentiveEncoder can be used in both directions at the same time?
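To make the question concrete, here is a minimal numpy sketch of what I have in mind: two symmetric cross-modal attention passes, one where audio frames attend over video (as in your AV Align model) and one where video frames attend over audio. This is only a conceptual illustration, not your actual TensorFlow code; all function and variable names here are hypothetical, and the fusion by concatenation is just one possible choice.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(queries, keys_values):
    # scaled dot-product attention: each query frame attends
    # over all frames of the other modality
    d = queries.shape[-1]
    scores = queries @ keys_values.T / np.sqrt(d)
    weights = softmax(scores, axis=-1)
    return weights @ keys_values

rng = np.random.default_rng(0)
audio = rng.normal(size=(50, 64))  # 50 audio frames, 64-dim features
video = rng.normal(size=(25, 64))  # 25 video frames, 64-dim features

# direction 1: audio queries attend over video (the existing video-to-audio path)
audio_ctx = cross_modal_attention(audio, video)
# direction 2: video queries attend over audio (the missing audio-to-video path)
video_ctx = cross_modal_attention(video, audio)

# fuse each stream with its cross-modal context, e.g. by concatenation
audio_fused = np.concatenate([audio, audio_ctx], axis=-1)  # shape (50, 128)
video_fused = np.concatenate([video, video_ctx], axis=-1)  # shape (25, 128)
```

In the sketch, both fused streams could then be passed onward (e.g. to the decoder), whereas the current AttentiveEncoder produces only the first direction.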
I am very glad that you are doing research in the same field, and thank you.
Sincerely, YongHyeok Lee.