Open calicratis19 opened 5 years ago
See https://keras.io/models/model/. By default, Keras will shuffle the data for you when training.
Thanks very much for the quick reply. I'm using a generator because the data is too big (17 GB) to fit in my RAM, so the default shuffling option does not apply in my case. I do not set steps_per_epoch to None, which means Keras does no shuffling in my case.
If I shuffle the rows randomly will it be alright? Is there any restriction on the shuffling?
Normally you should have 2 pieces of training data: the feature vectors as input and the desired output of the model for each vector. You need to make sure that you shuffle both of them in the same way.
But then - I think - shuffling should be fine (and the right thing to do).
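For data that is fed through a generator, one way to do this is to draw a fresh permutation of sample indices at the start of each epoch and use the same indices for inputs and targets. The following is a minimal sketch, not code from this repository: `shuffled_batches` and its parameters are hypothetical names, and it assumes `x` and `y` are NumPy arrays (or array-like datasets) with one sample along the first axis. If the model is recurrent and consumes rows as ordered sequences, a "sample" here would have to be a whole sequence rather than a single row, otherwise the time order inside each sequence is lost.

```python
import numpy as np

def shuffled_batches(x, y, batch_size, rng):
    """Yield (x, y) batches forever, reshuffling the sample order every epoch.

    Hypothetical helper: x and y are indexed along axis 0 and must have
    the same number of samples.
    """
    n = x.shape[0]
    while True:
        # One permutation per epoch, shared by inputs and targets so
        # every input stays paired with its own target.
        order = rng.permutation(n)
        for start in range(0, n, batch_size):
            # Sorting inside a batch does not change which samples are in it;
            # it only keeps the index list increasing, which on-disk datasets
            # such as h5py's expect.
            idx = np.sort(order[start:start + batch_size])
            yield x[idx], y[idx]

if __name__ == "__main__":
    x = np.arange(10).reshape(10, 1)
    y = x * 2
    gen = shuffled_batches(x, y, batch_size=4, rng=np.random.default_rng(0))
    bx, by = next(gen)
    print(bx.ravel(), by.ravel())
```

Because only one batch is indexed at a time, this approach does not need the whole data set in memory, provided `x` and `y` support lazy indexing.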
@calicratis19 @jmvalin, I want to train the model using my own data. Where can I find the code to generate the input data? Thanks!
@wuqiangch
Steps:
(1) cd src ; ./compile.sh
(2) ./denoise_training signal.raw noise.raw count > training.f32
(note the matrix size and replace 50000000 87 below)
(3) cd training ; ./bin2hdf5.py ../src/training.f32 50000000 87 training.h5
(4) ./rnn_train.py
(5) ./dump_rnn.py weights.hdf5 rnn_data.c rnn_data.rnnn name
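Step (2) says to note the matrix size, because step (3) needs the row and column counts on its command line. As a cross-check, the row count can be derived from the file size. This is a minimal sketch, assuming `training.f32` is a flat stream of 32-bit floats with 87 values per row, as the file name and the numbers in the steps suggest; `count_rows` is a hypothetical helper, not part of the repository.

```python
import os
import numpy as np

def count_rows(path, n_cols=87, dtype=np.float32):
    """Return how many rows of n_cols values the binary file holds.

    Assumes the file is a headerless stream of `dtype` values
    (an assumption about training.f32, not a documented format).
    """
    size = os.path.getsize(path)
    row_bytes = n_cols * np.dtype(dtype).itemsize  # 87 * 4 bytes by default
    if size % row_bytes != 0:
        raise ValueError(
            f"{path}: {size} bytes is not a whole number of {n_cols}-column rows"
        )
    return size // row_bytes

if __name__ == "__main__":
    import sys
    print(count_rows(sys.argv[1]))
```

If the number printed differs from the row count passed in step (3), the two arguments there would need to be adjusted to match the file.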
@calicratis19 Dear sir, what should I do to prepare signal.raw and noise.raw? How much work did you do on the data that jmvalin provided to make it match this step? Thank you!
@calicratis19 I mean, how can I prepare signal.raw and noise.raw from PCM or WAV data?
@carclodefly you need to merge all your noise audio files to a single file and all noiseless audio files to another single file. Then convert them to pcm format and name them noise.raw and signal.raw.
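The merge-and-convert step can be sketched with Python's standard `wave` module. This is only an illustration under stated assumptions: `wavs_to_raw` is a hypothetical helper, the input files are assumed to be uncompressed WAV, and the 16-bit, 48 kHz, mono format is taken from what a later comment in this thread reports passing to denoise_training, not from the tool's documentation.

```python
import wave

def wavs_to_raw(wav_paths, raw_path, rate=48000, channels=1, sample_width=2):
    """Append the samples of each WAV file to one headerless PCM file.

    Hypothetical helper. Files whose format differs from the expected
    rate / channel count / sample width are rejected rather than resampled.
    """
    with open(raw_path, "wb") as out:
        for path in wav_paths:
            with wave.open(path, "rb") as wav:
                fmt = (wav.getframerate(), wav.getnchannels(), wav.getsampwidth())
                if fmt != (rate, channels, sample_width):
                    raise ValueError(f"{path}: unexpected format {fmt}")
                # readframes returns the raw sample bytes without the WAV header
                out.write(wav.readframes(wav.getnframes()))
```

For example, `wavs_to_raw(["clean1.wav", "clean2.wav"], "signal.raw")` would build the speech file, and a second call with the noise recordings would build `noise.raw`; the file names here are placeholders.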
Thank you very much!
@calicratis19 Also, sir, if I want to train my model with the noise data that Mr. jmvalin provided, what kind of clean (noise-free) signal data can I use, and how much signal data do I need?
Sir, I did this with the signal from the McGill TSP speech database and the noise Mr. jmvalin offered, but the model output is not good. Do you know how to deal with the signal? I found that the feature extraction simply adds the signal and the noise in the frequency domain.
@carclodefly I also could not make the model work better. My output was worse than the original one. I could not get the model to decrease the loss much, so I gave up.
Step 2, you mention the third argument as count. Is this the
I am facing some issues when I try to play around with my custom speech and noise data. I have a min of both speech and noise (16-bit, 48 kHz, mono) PCM files that I am passing to denoise_training. I see 50000000 x 87 as the matrix size. But when I run the third command, I get File "./bin2hdf5.py", line 9, in
name
@sporwar-lifesize, for the command "./dump_rnn.py weights.hdf5 rnn_data.c rnn_data.rnnn name", where is the stored path for "name"?
I think "python dump_rnn.py" makes more sense here, doesn't it?
My training data set is a 50000000 x 87 matrix. In each iteration my model reads the data in the same order. It's recommended to shuffle training data at the beginning of each epoch so that the model generalizes better. Is it possible to shuffle the training data? Will it somehow make the data invalid?