facebookresearch / VisualVoice

Audio-Visual Speech Separation with Cross-Modal Consistency

Where can I get vocal_best.pth and facial_best.pth for speech enhancement model? #19

Open joannahong opened 2 years ago

joannahong commented 2 years ago

I have recently tried av-enhancement and found that the provided pretrained models only include the classifier, identity, unet, and lipreading_best.pth checkpoints. I could not find the vocal_best.pth and facial_best.pth pretrained models, so I tried using the ones from the original repository, but the result was not as good as what the demo video shows. Could you please add both pretrained models, or tell me how to solve this problem? Thank you so much for your help.
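
For reference, a quick way to sanity-check which network a given .pth file actually belongs to is to load it on CPU and inspect its parameter names and shapes. This is only a minimal sketch, assuming the checkpoints are plain PyTorch state dicts (the file name is just an example):

```python
import torch

# Assumption: the .pth file is a plain state dict (an OrderedDict of name -> tensor).
ckpt = torch.load("vocal_best.pth", map_location="cpu")

# Print the first few parameter names and shapes to see which model they correspond to.
for name, tensor in list(ckpt.items())[:10]:
    print(name, tuple(tensor.shape))
```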

sanjeelparekh commented 2 years ago

Hi @joannahong, were you able to generate the expected av_enhancement output for the demo video using the enhancement-specific models? If so, please let me know how you managed to do that and whether you changed anything.

joannahong commented 2 years ago

Hello @sanjeelparekh, I was not able to resolve the problem. :( Sorry to say that.

sanjeelparekh commented 2 years ago

Thanks @joannahong for your quick response. To be sure, you tried the models from this link, right? https://drive.google.com/drive/folders/1A24lu_ct7jxMgQ5PnOoOYrbo7Xx7A1fy?usp=sharing

krantiparida commented 1 year ago

@sanjeelparekh, the link shared in the above comment is empty. I know it has been a long time since then, but could you please share an updated link if possible?

rhgao commented 1 year ago

Hi, sorry, I lost push access to the repo, and the previous Google Drive link also expired after my graduation. Here is the new link: https://drive.google.com/drive/folders/1fwjDu3umZBvOJf0GRHMk5RfSg7Kd4WUc?usp=sharing
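
If anyone wants to fetch the whole shared folder programmatically rather than through the browser, something like the following sketch with the gdown package should work (the output directory name is just an example, and the folder is assumed to be publicly readable):

```python
import gdown

# Google Drive folder shared above.
url = "https://drive.google.com/drive/folders/1fwjDu3umZBvOJf0GRHMk5RfSg7Kd4WUc?usp=sharing"

# Download every file in the folder into a local directory (name chosen here for illustration).
gdown.download_folder(url=url, output="pretrained_models", quiet=False)
```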

krantiparida commented 1 year ago

Thanks @rhgao.