facebookresearch / svoice

We provide a PyTorch implementation of the paper Voice Separation with an Unknown Number of Multiple Speakers, in which we present a new method for separating a mixed audio sequence in which multiple voices speak simultaneously. The new method employs gated neural networks that are trained to separate the voices at multiple processing steps, while keeping the speaker in each output channel fixed. A different model is trained for every possible number of speakers, and the model with the largest number of speakers is used to select the actual number of speakers in a given sample. Our method greatly outperforms the current state of the art, which, as we show, is not competitive for more than two speakers.

Pre-trained models #31

Open DavidHilly opened 3 years ago

DavidHilly commented 3 years ago

When will the pre-trained models be available?

Thanks

adamfils commented 3 years ago

I trained the model on the toy dataset for about 500 epochs. I could give you the link to my model if you want. It does pretty well on some files. I will have to train on a larger dataset.

egorsmkv commented 3 years ago

Sure, @adamfils, please release a link to your models.

larsh0103 commented 3 years ago

Any updates on pretrained models?

Would be great to get access.

ruchind159 commented 2 years ago

Hi @adamfils, can you share the link to your trained model, please?

JoonHong-Kim commented 2 years ago

Hi @adamfils, could you share the link to your model too? I need to use this model urgently.

matthewkleinmann commented 2 years ago

I too would like access to a pre-trained model. I have a slow machine with only a CPU. This seems to be a recurring issue here.

bballboy8 commented 2 years ago

Ditto, I would be very grateful for access to that pre-trained model.

GeorgiyNaumenko commented 2 years ago

Hi @adamfils, it would be great if you gave access to the pre-trained model :)

bballboy8 commented 2 years ago

Yes, it would be great if someone could share a pre-trained model they've had success with.

darkdante2209 commented 2 years ago

Hi @adamfils, could you please share your pretrained model? Thanks a lot.

desperado1999 commented 2 years ago

Hi @adamfils, could you please share your pretrained model? Thanks very much.

muhammad-ahmed-ghani commented 1 year ago

@darkdante2209 @desperado1999 Hi, I trained a model on the open-source LibriMix dataset. If you just want a demonstration for academic research purposes, it might help you. As I don't have many resources for training, I trained it for just 31 epochs. Accuracy is still bad: 0.04 train loss and 0.64 validation loss. It is underfitting, but I don't have enough time to look into it. At least you can test it out.

svoice_demo

jain11vaibhav commented 1 year ago

Hi @adamfils, could you please share the link to your model too? I need to run inference on the data I have.

desperado1999 commented 1 year ago

> @darkdante2209 @desperado1999 Hi I trained model on open source librimix dataset and I hope if you just want the demonstration for academic research purpose than it might help you. As I don't have much resources to train, it took 2 weeks for just 19 epochs on Gtx 1070. Accuracy is still bad 0.6 train loss. svoice_demo

Wow! Thanks very much! I have recently been trying to train svoice on another dataset similar to LibriMix (the sources in each mixed file are from the same speaker), and I think your pretrained model could significantly reduce my training time.

adamfils commented 1 year ago

Hi guys, sorry for the delayed response. Here is a link to the model: https://drive.google.com/file/d/1F6ll_HyyrhjWjC3J74kBdT--4a9hPMX6/view?usp=sharing

@egorsmkv @jain11vaibhav @desperado1999 @darkdante2209 @GeorgiyNaumenko @JoonHong-Kim @ruchind159

qalabeabbas49 commented 1 year ago

Hi @adamfils, is this a 2-speaker separation model? Which data did you train it on?