andabi / music-source-separation

Deep neural networks for separating singing voice from music written in TensorFlow
795 stars 150 forks

about the unet model #48

Open ucasiggcas opened 4 years ago

ucasiggcas commented 4 years ago

Hi, has anyone succeeded with the U-Net model in model_unet.py and train_unet.py? I find it a little difficult to understand. How do I get the U-Net version, and does the U-Net need an RNN?

Could anyone share a working Python script? Thanks.

ucasiggcas commented 4 years ago

Could anyone translate the model to Keras?

ucasiggcas commented 4 years ago

https://github.com/andabi/music-source-separation/issues/49

jadujoel commented 4 years ago

Here's a U-Net model in Keras style: https://www.dropbox.com/sh/v2rd5fyx1vozvmb/AACuSCNvxozcmx1Z-i4fE6Gka?dl=0

Model config is inspired by https://ismir2017.smcnus.org/wp-content/uploads/2017/10/171_Paper.pdf

Hope that helps. Best / Joel


ucasiggcas commented 4 years ago


Thanks, I will give it a try.

ucasiggcas commented 4 years ago

@jadujoel It would be great if you used a Keras RNN, since the U-Net model is big. Maybe 3 RNN layers, 2 dense layers, and 2 mask layers would be the best model.

And how do I compute the mask in Keras?
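For what it's worth, here is a minimal sketch of the kind of masking RNN suggested above (3 recurrent layers, 2 dense layers, and 2 sigmoid mask outputs multiplied with the input mixture spectrogram). All layer sizes and the choice of LSTM cells are my own assumptions, not taken from this repo:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model

n_bins = 512  # frequency bins per frame (n_fft=1024 with the DC bin dropped)

inp = layers.Input(shape=(None, n_bins))           # (time, freq) magnitude frames
x = layers.LSTM(256, return_sequences=True)(inp)   # 3 recurrent layers
x = layers.LSTM(256, return_sequences=True)(x)
x = layers.LSTM(256, return_sequences=True)(x)
x = layers.Dense(512, activation="relu")(x)        # 2 dense layers
x = layers.Dense(512, activation="relu")(x)

# 2 mask layers: sigmoid masks in [0, 1], applied to the mixture spectrogram
mask_voice = layers.Dense(n_bins, activation="sigmoid")(x)
mask_accomp = layers.Dense(n_bins, activation="sigmoid")(x)
voice = layers.Multiply()([inp, mask_voice])
accomp = layers.Multiply()([inp, mask_accomp])

model = Model(inp, [voice, accomp])
```

The masks are just the last layers' sigmoid outputs; multiplying them element-wise with the input magnitude spectrogram gives the two estimated sources.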

ucasiggcas commented 4 years ago

@jadujoel I think something may be wrong in your data.py:

    def make_dataset(self):
        self.load_audio()
        X_ffts = self.make_ffts(self.X_signals)
        X_mags = np.abs(X_ffts)
        self.X, self.X_max = self.normalize(X_mags)

        Y_ffts = self.make_ffts(self.Y_signals)
        Y_mags = np.abs(X_ffts)
        self.Y, self.Y_max = self.normalize(X_mags)

self.Y, self.Y_max = self.normalize(X_mags)

Are you sure? Shouldn't it be `Y_mags = np.abs(Y_ffts)` and then `self.Y, self.Y_max = self.normalize(Y_mags)`?
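For reference, here is what I assume the intended lines are, as a standalone sketch; `normalize` below is a stand-in for the repo's own function, and the complex arrays stand in for the STFTs of the mixture (X) and the target (Y):

```python
import numpy as np

def normalize(mags):
    # simple max normalization, standing in for data.py's normalize()
    m = mags.max()
    return mags / m, m

# stand-in STFTs for the mixture (X) and the target source (Y)
X_ffts = np.array([[1 + 1j, 2 + 0j], [0 + 3j, 4 + 0j]])
Y_ffts = np.array([[0 + 2j, 1 + 0j], [5 + 0j, 0 + 1j]])

X_mags = np.abs(X_ffts)
X, X_max = normalize(X_mags)

# the fix: derive Y from Y_ffts, not from X_ffts / X_mags
Y_mags = np.abs(Y_ffts)
Y, Y_max = normalize(Y_mags)
```

With the original lines, Y would just be a copy of X, so the network would be trained to reproduce its input.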

ucasiggcas commented 4 years ago

And another question: fft = librosa.stft(signal, n_fft=1024, hop_length=512)[1:,:]

If you keep only bins 1:513 (dropping the DC bin), how do you restore the signal?

x = X_signals[0]
xstft = librosa.stft(x, n_fft=1024, hop_length=512)[1:, :]
xistft = librosa.istft(xstft, win_length=1024, hop_length=512)

It does not reconstruct the signal correctly!
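One way to make the inverse transform work after dropping the DC bin is to pad a zero row back on top, so the array has the 1 + n_fft // 2 = 513 bins that librosa.istft expects for n_fft=1024. A sketch with a random stand-in spectrogram (the librosa call itself is left as a comment):

```python
import numpy as np

# xstft stands in for librosa.stft(x, n_fft=1024, hop_length=512)[1:, :],
# which has 512 frequency bins instead of the 513 istft expects.
rng = np.random.default_rng(0)
xstft = rng.standard_normal((512, 100)) + 1j * rng.standard_normal((512, 100))

# Pad one zero row at the top, standing in for the dropped DC (0 Hz) bin.
xstft_full = np.pad(xstft, ((1, 0), (0, 0)))

# Then reconstruction has the expected shape:
# x_rec = librosa.istft(xstft_full, hop_length=512, win_length=1024)
```

Since the DC bin of real music is usually close to zero, replacing it with zeros is a reasonable approximation; the reconstruction will not be bit-exact but should sound the same.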