dr-pato / audio_visual_speech_enhancement

Face Landmark-based Speaker-Independent Audio-Visual Speech Enhancement in Multi-Talker Environments
https://dr-pato.github.io/audio_visual_speech_enhancement/
Apache License 2.0

Parameter setting #24

Closed by truewangxiaolong 3 years ago

truewangxiaolong commented 3 years ago

Hi @dr-pato, I have some questions about training the model. Could you please tell me which values would be suitable for '--hidden_units' (number of units of BLSTM cells) and '--layers' (number of stacked BLSTM cells) for the model 'vl2m' or 'av_concat_mask_ref'? Thank you so much. Best, Yuyue.

dr-pato commented 3 years ago

Hi @truewangxiaolong, you can find all the information about the parameters in the paper: https://arxiv.org/pdf/1811.02480.pdf. The number of hidden units is set to 250 for all models. The number of layers is 5 for 'vl2m' and 3 for 'av_concat_mask_ref'.

Best, Giovanni
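
For quick reference, the values quoted in the reply above can be collected in a small lookup. This is only a sketch: the flag names '--hidden_units' and '--layers' come from the question, but the dictionary `RECOMMENDED` and the helper `training_flags` are hypothetical names used for illustration and are not part of the repository. The training script name is not given in this thread, so it is left out.

```python
# Hyperparameters as stated in the reply above (250 hidden units for all
# models; 5 layers for 'vl2m', 3 for 'av_concat_mask_ref').
# RECOMMENDED and training_flags are illustrative names, not repository code.
RECOMMENDED = {
    "vl2m": {"hidden_units": 250, "layers": 5},
    "av_concat_mask_ref": {"hidden_units": 250, "layers": 3},
}


def training_flags(model):
    """Return the two BLSTM flags for `model` as a list of CLI tokens."""
    params = RECOMMENDED[model]  # raises KeyError for an unknown model name
    return [
        "--hidden_units", str(params["hidden_units"]),
        "--layers", str(params["layers"]),
    ]


if __name__ == "__main__":
    # Print the flag fragment to append to whatever training command is used.
    for name in RECOMMENDED:
        print(name, " ".join(training_flags(name)))
```

The resulting tokens would be appended to the training command for the chosen model; any other required arguments are outside the scope of this thread.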

truewangxiaolong commented 3 years ago

Hi @dr-pato, oh, I see. It's so kind of you to reply. Thank you so much. Best, Yuyue.