andabi / music-source-separation

Deep neural networks for separating singing voice from music, written in TensorFlow
795 stars · 150 forks

loss graph #7

Open balenko1992 opened 6 years ago

balenko1992 commented 6 years ago

Have you set 100k epochs for training?

Our loss graph with the MIR-1K dataset over 20k epochs doesn't show the expected behavior of a decreasing curve. Can you post your loss graph result?

[image: 21744830_10212070195602462_149376697_o]

andabi commented 6 years ago

@vrosato The loss graph I got was a lot like yours. If you set the smoothing rate to 0.99, you will probably see a decreasing loss graph. Honestly, I have no good explanation for why the graph oscillates so much. BTW, if you get the chance, I recommend trying the L-BFGS optimization method, and please let me know how the results turn out ;)
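(For context: the "smooth rate" discussed here is presumably the TensorBoard smoothing slider, which applies an exponential moving average to the raw scalar curve so the downward trend shows through the oscillation. A minimal sketch of that smoothing, with a hypothetical `raw_losses` list of per-step loss values:)

```python
def smooth(raw_losses, weight=0.99):
    """Exponential moving average, in the style of TensorBoard's smoothing slider."""
    smoothed = []
    last = raw_losses[0]  # seed with the first raw value
    for value in raw_losses:
        last = weight * last + (1.0 - weight) * value
        smoothed.append(last)
    return smoothed
```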

balenko1992 commented 6 years ago

Thanks for the answer. We set the smoothing rate to 0.99 and this is the result. Did you use the iKala or the MIR-1K dataset in the training process with 100k epochs?

[image: 21744820_10212077225818213_1620220579_o]

andabi commented 6 years ago

@vrosato I used only iKala (it was better than mixing the two), and I remember that training for over 20k steps was enough to get a generalized model. I hope you get the right result.

balenko1992 commented 6 years ago

We are training the network with the MIR-1K dataset. After that we will try the iKala dataset and then compare the results.

For the L-BFGS optimizer we found this: https://github.com/midori1/pylbfgs. Is it the correct implementation?

thx
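If the training code stays on TF 1.x, an alternative to a separate pylbfgs binding is the SciPy bridge shipped in tf.contrib. The following is a minimal sketch on a toy quadratic problem, not the repo's code; the model's actual loss tensor and feed_dict would replace the toy pieces.

```python
import numpy as np
import tensorflow as tf

# Toy quadratic problem, only to show how L-BFGS-B is wired in;
# in practice `loss` would be the model's MSE loss tensor.
w = tf.get_variable('w', shape=[2], initializer=tf.zeros_initializer())
target = tf.constant(np.array([3.0, -1.0], dtype=np.float32))
loss = tf.reduce_sum(tf.square(w - target))

# ScipyOptimizerInterface hands the graph's variables to scipy.optimize.minimize.
optimizer = tf.contrib.opt.ScipyOptimizerInterface(
    loss, method='L-BFGS-B', options={'maxiter': 100})

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    optimizer.minimize(sess)  # runs the full L-BFGS-B optimization
    print(sess.run(w))        # ~[ 3., -1.]
```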

balenko1992 commented 6 years ago

Why do GNSDR, GSIR, and GSAR take the same values? The expected results in the paper are different.

balenko1992 commented 6 years ago

Did you split the dataset into train and eval sets, using 175 clips for training and 825 for eval like paper [3]?

andabi commented 6 years ago

@balenko1992 Yes, I did, but the sizes of the splits are not the same as in the paper.

leimao commented 6 years ago

Dear andabi,

I am trying to reproduce the work you have done. I read your code and am writing my own using the same neural network architecture. Although my code looks quite different from yours, I believe the core training part is almost the same. I used
`y_tilde_src1 = y_hat_src1 / (y_hat_src1 + y_hat_src2 + np.finfo(float).eps) * self.x_mixed`
`y_tilde_src2 = y_hat_src2 / (y_hat_src1 + y_hat_src2 + np.finfo(float).eps) * self.x_mixed`
as the outputs of the neural network and applied them to the MSE loss function:
`loss = tf.reduce_mean(tf.square(self.y_src1 - self.y_pred_src1) + tf.square(self.y_src2 - self.y_pred_src2), name='MSE_loss')`
During training I could see the training loss drop from an initial 8-9 to 4 within 500 time steps (i.e. 500 mini-batches); however, after another 20K time steps the training loss is still around 3-4. So I came here to read your post.

What is the smoothing rate you talked about above? And what is the definition of an epoch here (is it the number of mini-batches)? Thank you.
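To make the masking-plus-loss setup described above concrete, here is a self-contained TF 1.x sketch; the spectrogram shapes and the dense stand-in for the recurrent layers are illustrative assumptions, not the repo's exact architecture.

```python
import numpy as np
import tensorflow as tf

# Hypothetical magnitude-spectrogram shapes: (batch, time_frames, freq_bins).
x_mixed = tf.placeholder(tf.float32, [None, 64, 513], name='x_mixed')
y_src1 = tf.placeholder(tf.float32, [None, 64, 513], name='y_src1')
y_src2 = tf.placeholder(tf.float32, [None, 64, 513], name='y_src2')

# Stand-in for the network body; the real model uses stacked RNN layers here.
hidden = tf.layers.dense(x_mixed, 256, activation=tf.nn.relu)
y_hat_src1 = tf.layers.dense(hidden, 513, activation=tf.nn.relu)
y_hat_src2 = tf.layers.dense(hidden, 513, activation=tf.nn.relu)

# Time-frequency masking layer: normalize the two estimates into soft masks
# and apply them to the mixture spectrogram.
eps = np.finfo(float).eps
y_pred_src1 = y_hat_src1 / (y_hat_src1 + y_hat_src2 + eps) * x_mixed
y_pred_src2 = y_hat_src2 / (y_hat_src1 + y_hat_src2 + eps) * x_mixed

# MSE between the masked estimates and the clean source spectrograms.
loss = tf.reduce_mean(tf.square(y_src1 - y_pred_src1)
                      + tf.square(y_src2 - y_pred_src2), name='MSE_loss')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    feed = {p: np.random.rand(4, 64, 513).astype(np.float32)
            for p in (x_mixed, y_src1, y_src2)}
    print(sess.run(loss, feed_dict=feed))  # one forward pass on random data
```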

cheriylan commented 3 years ago

> Why do GNSDR, GSIR, and GSAR take the same values? The expected results in the paper are different.

How do you see GNSDR, GSIR, and GSAR? I ran eval.py, but I only get the separation result without GNSDR, GSIR, GSAR. How can I get them?
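One way to get those numbers, if eval.py only writes the separated audio, is to compute them with mir_eval. Below is a hedged sketch under the usual definitions (NSDR is the SDR of the estimate minus the SDR of the raw mixture, and the G* metrics are length-weighted averages over clips); the `clips` container and its layout are hypothetical, not part of the repo.

```python
import numpy as np
from mir_eval.separation import bss_eval_sources

def global_metrics(clips):
    """clips: iterable of (mixture, clean_src1, clean_src2, est_src1, est_src2)
    waveforms, each a 1-D numpy array of equal length within a clip."""
    total_len = 0.0
    gnsdr = np.zeros(2)
    gsir = np.zeros(2)
    gsar = np.zeros(2)
    for mixed, src1, src2, est1, est2 in clips:
        ref = np.vstack([src1, src2])
        est = np.vstack([est1, est2])
        mix = np.vstack([mixed, mixed])  # mixture scored against each reference
        sdr, sir, sar, _ = bss_eval_sources(ref, est, compute_permutation=False)
        sdr_mix, _, _, _ = bss_eval_sources(ref, mix, compute_permutation=False)
        n = float(len(mixed))
        gnsdr += n * (sdr - sdr_mix)  # NSDR = SDR(estimate) - SDR(mixture)
        gsir += n * sir
        gsar += n * sar
        total_len += n
    return gnsdr / total_len, gsir / total_len, gsar / total_len
```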