alibabasglab / MossFormer

This repo provides the processed samples of the manuscript "MossFormer: Pushing the Performance Limit of Monaural Speech Separation using Gated Single-head Transformer with Convolution-augmented Joint Self-Attentions", which was submitted to ICASSP 2023.
Apache License 2.0

Help with training #2

Open hananbo26 opened 1 year ago

hananbo26 commented 1 year ago

Hi,

I want to use MossFormer to separate reverberant speech, but I am struggling to load the WHAMR! dataset into ModelScope. The example you provided uses the Libri2Mix_8k dataset, which is hosted by ModelScope, while WHAMR! is not hosted and seems to require different preprocessing than Libri2Mix_8k. Could you please provide some instructions on how to load the WHAMR! dataset into ModelScope? Alternatively, could you please upload the model that was trained on the WHAMR! dataset?

Thanks in advance

mayccc1 commented 1 year ago

Hi, I ran into the same problem before. You can generate a CSV file from your own dataset. The CSV file needs to include three keys (mix_wav:FILE, s1_wav:FILE, s2_wav:FILE). Then use MsDataset.load('/path/to/my_file.csv') to load your own dataset. Good luck!
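
A minimal sketch of how such a CSV file could be generated, using only the Python standard library. The three column names come from the comment above. Everything else is an assumption for illustration: the function name `build_separation_csv` is hypothetical, and the code assumes the mixtures and the two sources live in three separate folders with identical file names, which may not match your copy of the dataset.

```python
import csv
from pathlib import Path


def build_separation_csv(mix_dir, s1_dir, s2_dir, csv_path):
    """Write one CSV row per mixture that has both matching source files.

    Assumption: a mixture and its two sources share the same file name
    and are stored in three separate folders.
    """
    mix_dir, s1_dir, s2_dir = Path(mix_dir), Path(s1_dir), Path(s2_dir)
    rows = []
    for mix_wav in sorted(mix_dir.glob("*.wav")):
        s1_wav = s1_dir / mix_wav.name
        s2_wav = s2_dir / mix_wav.name
        # Skip mixtures with a missing source instead of writing a broken row.
        if s1_wav.is_file() and s2_wav.is_file():
            rows.append([
                str(mix_wav.resolve()),
                str(s1_wav.resolve()),
                str(s2_wav.resolve()),
            ])

    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        # Header keys as given in the comment above.
        writer.writerow(["mix_wav:FILE", "s1_wav:FILE", "s2_wav:FILE"])
        writer.writerows(rows)
    return len(rows)
```

Calling, for example, `build_separation_csv("train/mix", "train/s1", "train/s2", "train.csv")` (placeholder folder names) returns the number of rows written. The resulting file is what would be passed to MsDataset.load as described above. Which WHAMR! folders to use as mixture and targets depends on the task you want to train for.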

RyderZ-Neo commented 11 months ago

@mayccc1 Hello. Do you mean the same way SpeechBrain does its data preprocessing?

Thanks in advance.

mayccc1 commented 11 months ago

Hi, I'm not familiar with SpeechBrain's data preprocessing method, so I can't help with that. Sorry.

alibabasglab commented 10 months ago

Hi, if you want to try the WHAM! or WHAMR! dataset, please use our latest model, 'MossFormer2', from this link: https://modelscope.cn/models/damo/speech_mossformer2_separation_temporal_8k/summary . Thanks.

alibabasglab commented 10 months ago

MossFormer2 is also described on GitHub: https://github.com/alibabasglab/MossFormer2

Shirley-0708 commented 5 months ago

I'm wondering if I can fine-tune MossFormer on my dataset. Is this possible?