facebookresearch / svoice

We provide a PyTorch implementation of the paper Voice Separation with an Unknown Number of Multiple Speakers, in which we present a new method for separating a mixed audio sequence where multiple voices speak simultaneously. The new method employs gated neural networks that are trained to separate the voices at multiple processing steps, while keeping the speaker in each output channel fixed. A different model is trained for every possible number of speakers, and the model with the largest number of speakers is used to select the actual number of speakers in a given sample. Our method greatly outperforms the current state of the art, which, as we show, is not competitive for more than two speakers.

Support for sm_86 #60

Open arikhalperin opened 2 years ago

arikhalperin commented 2 years ago

Hello, in order to support sm_86 we need to install PyTorch 1.9.0, but it breaks svoice. Is there a plan to upgrade?

Thanks, Arik

arikhalperin commented 2 years ago

I managed to get it working on sm_86 with minor changes to the code. I am thinking about creating a PR for this. These are the requirements I had to change:

    torch==1.7.1+cu110
    torchaudio==0.7.2
    torchvision==0.8.2+cu110
    numpy==1.21.5

In the code I had to do this (I changed the call to the loss function):

    sisnr_loss, snr, est_src, reorder_est_src = cal_loss(
        sources, estimate_source[c_idx], lengths)

I hope it helps.
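
For anyone trying the pinned requirements above, here is a minimal sketch of how one might check whether a given PyTorch build can run on an sm_86 GPU. It assumes CUDA's binary compatibility rule: code compiled for sm_XY runs on devices with the same major version and an equal or higher minor version, so a build that includes sm_80 should load on sm_86. The helper names `parse_sm` and `build_covers` are hypothetical and not part of svoice or PyTorch. The local check at the bottom only runs if torch is installed and `torch.cuda.get_arch_list` is available in that version.

```python
def parse_sm(arch):
    """Turn 'sm_86' into (8, 6). Return None for other entries, e.g. 'compute_80'."""
    if not arch.startswith("sm_"):
        return None
    digits = arch[3:]
    if len(digits) < 2 or not digits.isdigit():
        return None
    # Last digit is the minor version, everything before it is the major version.
    return int(digits[:-1]), int(digits[-1])


def build_covers(capability, arch_list):
    """True if any compiled arch has the same major and a minor <= the device's minor."""
    major, minor = capability
    for arch in arch_list:
        parsed = parse_sm(arch)
        if parsed is not None and parsed[0] == major and parsed[1] <= minor:
            return True
    return False


def describe_local_build():
    """Report on the locally installed torch build, if there is one."""
    try:
        import torch
    except ImportError:
        return "torch is not installed"
    if not torch.cuda.is_available():
        return "no CUDA device visible to torch " + torch.__version__
    if not hasattr(torch.cuda, "get_arch_list"):
        return "this torch version does not expose torch.cuda.get_arch_list"
    capability = torch.cuda.get_device_capability(0)
    archs = torch.cuda.get_arch_list()
    covered = build_covers(capability, archs)
    return "device capability %s, build archs %s, covered: %s" % (capability, archs, covered)


if __name__ == "__main__":
    print(describe_local_build())
```

The check looks only at precompiled binaries (the `sm_` entries) and ignores PTX entries such as `compute_80`, so a negative result does not rule out JIT compilation from PTX.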

gashishoo7 commented 2 years ago

Thanks for the sm_86 support. I tried the changes mentioned above, but I get an error in torchaudio.load during the cross-validation step, after one epoch of training completes.