Closed lucasdobr15 closed 3 years ago
Hi @lucasdobr15,
which function have you called for separation?
Did you call model.separate_track
like below?
from lasaft.pretrained import PreTrainedLaSAFTNet
model = PreTrainedLaSAFTNet(model_name='lasaft_large_2020')
vocals = model.separate_track(audio, 'vocals')
drums = model.separate_track(audio, 'drums')
bass = model.separate_track(audio, 'bass')
other = model.separate_track(audio, 'other')
Then please check that you have set the pretrained model to CUDA mode:
model = model.cuda()
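If you want to confirm that the weights actually moved to the GPU, you can inspect the device of the module's parameters. This is a minimal sketch using a stand-in `torch.nn.Linear` module; `PreTrainedLaSAFTNet` is a PyTorch module, so the same check should apply to it.

```python
import torch

# Stand-in module for illustration (replace with your PreTrainedLaSAFTNet instance)
model = torch.nn.Linear(4, 4)

# Move to GPU only when one is available, to keep the snippet portable
if torch.cuda.is_available():
    model = model.cuda()

# The device of any parameter tells you where the model lives
device = next(model.parameters()).device
print(device)  # 'cuda:0' when the GPU is used, 'cpu' otherwise
```

If this prints `cpu` even though you installed CUDA, the PyTorch build itself likely lacks GPU support.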
If it does not work, then please share the code script you have used.
This is the source code of the script: 4stems.py
import os
import numpy as np
import soundfile as sf
from lasaft.pretrained import PreTrainedLaSAFTNet

model = PreTrainedLaSAFTNet(model_name='lasaft_large_2021')

audio, fs = sf.read('test.wav')
number_samples, number_channels = np.shape(audio)

# audio should be a numpy array of a stereo audio track
# with dtype of float32
# shape must be (T, 2)

vocals2 = model.separate_track(audio, 'vocals')
vocals = vocals2[0:number_samples]

drums2 = model.separate_track(audio, 'drums')
drums = drums2[0:number_samples]

bass2 = model.separate_track(audio, 'bass')
bass = bass2[0:number_samples]

other2 = model.separate_track(audio, 'other')
other = other2[0:number_samples]

sf.write('test_vocals.wav', vocals, fs)
sf.write('test_drums.wav', drums, fs)
sf.write('test_bass.wav', bass, fs)
sf.write('test_other.wav', other, fs)

os.remove("temp.wav")
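Since the model expects a `(T, 2)` float32 array, a mono or non-float32 file read by `soundfile` needs to be converted first. Below is a minimal sketch of such a conversion; `ensure_stereo_float32` is a hypothetical helper name, not part of the lasaft package.

```python
import numpy as np

def ensure_stereo_float32(audio):
    """Hypothetical helper: coerce audio into the (T, 2) float32 layout
    that separate_track expects. Not part of lasaft itself."""
    audio = np.asarray(audio, dtype=np.float32)
    if audio.ndim == 1:
        # mono -> duplicate the single channel into left and right
        audio = np.stack([audio, audio], axis=1)
    return audio

mono = np.zeros(44100)                 # one second of mono silence
stereo = ensure_stereo_float32(mono)
print(stereo.shape, stereo.dtype)      # (44100, 2) float32
```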
What do you suggest?
Please try the following:
model = PreTrainedLaSAFTNet(model_name='lasaft_large_2021').cuda()
Thank you so much! I just added this code and it worked <3
We should be the ones thanking you for this beautiful project!
One question:
Why does the end result have "lags" in the songs?
You're welcome :) And what kind of lags do you mean? Can you share a sample?
ORIGINAL (SAMPLE 37 SECONDS): https://www.youtube.com/watch?v=9TNyueKk2Nw
LASAFT (SAMPLE 37 SECONDS): https://www.youtube.com/watch?v=wFB3SR29WTI
Why does that kind of problem happen? :(
Thank you for sharing :) Is the "lag" you mentioned something like 0:16~0:18 in https://www.youtube.com/watch?v=wFB3SR29WTI ?
Yes, exactly
It's this kind of 'lag' I'm talking about
And another thing: LASAFT thinks these instruments are vocals :/ (guitar solo, flute, wind instruments, and especially the organ)
Do you have any suggestions to correct these 2 problems?
Wow, your analysis is spot on. We have also suspected that our models have difficulty distinguishing the singing voice from the instruments you mentioned. It is probably because the training dataset (MUSDB18) only contains four groups of instruments (vocals, drums, bass, and other), so there are no explicit training cases that force the model to distinguish those instruments from the singing voice. Possible solutions are quite technical, and we have been designing them.
Below is the list of possible solutions
We have been working on this new project and will publish it after submitting a new paper if it produces a better result :)
Hello
I wonder if LASAFT can separate music using the GPU? (I'm not referring to model training.)
Because I already did everything here: I installed CUDA, cuDNN, and TensorFlow-GPU, and even so LASAFT insists on using only the CPU to separate the songs :(
I await feedback, thank you very much :D