dr-pato / audio_visual_speech_enhancement

Face Landmark-based Speaker-Independent Audio-Visual Speech Enhancement in Multi-Talker Environments
https://dr-pato.github.io/audio_visual_speech_enhancement/
Apache License 2.0
106 stars 25 forks

can you tell me how to get GRID dataset #3

Closed lvyilan23 closed 5 years ago

dr-pato commented 5 years ago

Hi lvyilan23, you can download the dataset at the following link: http://spandh.dcs.shef.ac.uk/gridcorpus/.
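For convenience, the download can be scripted. The sketch below uses only the base URL given above; the per-speaker archive paths and filenames are assumptions (the corpus page lists the actual archives per speaker, s1 to s34), so verify them before running:

```python
import os
import urllib.request

# Base URL of the GRID corpus (from the link above). The per-speaker
# archive paths below are assumptions -- check the corpus page for the
# actual filenames before downloading.
BASE_URL = "http://spandh.dcs.shef.ac.uk/gridcorpus"


def grid_archive_urls(speaker: int) -> dict:
    """Build hypothetical archive URLs for one GRID speaker (1..34)."""
    sid = f"s{speaker}"
    return {
        "audio": f"{BASE_URL}/{sid}/audio/{sid}.tar",          # assumed audio archive path
        "video": f"{BASE_URL}/{sid}/video/{sid}.mpg_vcd.zip",  # assumed video archive path
    }


def download(url: str, dest_dir: str = "grid_corpus") -> str:
    """Fetch one archive into dest_dir, skipping files already present."""
    os.makedirs(dest_dir, exist_ok=True)
    dest = os.path.join(dest_dir, os.path.basename(url))
    if not os.path.exists(dest):
        urllib.request.urlretrieve(url, dest)
    return dest


# Example driver (network access required; uncomment to run):
# for speaker in range(1, 35):
#     for url in grid_archive_urls(speaker).values():
#         print("fetching", url)
#         download(url)
```

Note that the archives are large (roughly 1 GB per speaker including video), so downloading a couple of speakers first is a sensible way to test the pipeline.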

lvyilan23 commented 5 years ago

Thank you very much! I have studied speech processing before, and now I want to move into the multi-modal audio-visual field. Could you please recommend some classic papers?


dr-pato commented 5 years ago

Hi, here is a selection of papers about audio-visual speech processing:

Audio-visual speech recognition

J. S. Chung, A. Senior, O. Vinyals, and A. Zisserman, “Lip reading sentences in the wild,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 3444–3453.

T. Afouras, J. S. Chung, A. Senior, O. Vinyals, and A. Zisserman, “Deep audio-visual speech recognition,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018.

Audio-visual speech separation

A. Ephrat, I. Mosseri, O. Lang, T. Dekel, K. Wilson, A. Hassidim, W. T. Freeman, and M. Rubinstein, “Looking to listen at the cocktail party: A speaker-independent audio-visual model for speech separation,” ACM Transactions on Graphics, vol. 37, no. 4, pp. 1–11, Jul. 2018.

T. Afouras, J. S. Chung, and A. Zisserman, “The conversation: Deep audio-visual speech enhancement,” in Interspeech, 2018.

A. Owens and A. A. Efros, “Audio-visual scene analysis with self-supervised multisensory features,” in European Conference on Computer Vision (ECCV), 2018.

Audio-visual speech enhancement

A. Gabbay, A. Shamir, and S. Peleg, “Visual speech enhancement,” in Interspeech, 2018, pp. 1170–1174.

D. Michelsanti, Z.-H. Tan, S. Sigurdsson, and J. Jensen, “On training targets and objective functions for deep-learning-based audio-visual speech enhancement,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019, pp. 8077–8081.

I hope this helps.

lvyilan23 commented 5 years ago

Thank you so much!