Open akfalc opened 6 years ago
Hi,
You can start debugging by running Merlin up to the NORMCMP step and checking whether the .cmp files are correctly generated and placed in '/home/akfalc/merlin/egs/voice_conversion/s1/experiments/jmk2ksp/acoustic_model/inter_module/nn_no_silence_lab_norm_127/'
Hi,
Actually, it is creating the cmp files, but it places them in /home/akfalc/merlin/egs/voice_conversion/s1/experiments/jmk2ksp/acoustic_model/inter_module/nn_cc_fv_fo_127
So I don't understand why these files are being looked up in the other directory.
OK, I see the problem.
Check the script ./scripts/create_symbolic_link.sh (6th step in the vc recipe). It should create the nn_no_silence_lab_norm_127/ folder as a symbolic link pointing to nn_cc_fv_fo_127/. That way, any file requested from nn_no_silence_lab_norm_127/ is read from nn_cc_fv_fo_127/ automatically.
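To make the symbolic-link idea concrete, here is a minimal Python sketch of what such a link does and how to check it. It is not the content of create_symbolic_link.sh; the helper names `link_directory` and `missing_files` are hypothetical and only illustrate the mechanism.

```python
import os

def link_directory(target_dir, link_path):
    """Create link_path as a symlink to target_dir, unless the link already exists."""
    if os.path.islink(link_path):
        return os.path.realpath(link_path)
    if os.path.exists(link_path):
        # A real directory is in the way; do not silently replace it.
        raise FileExistsError(link_path + " exists and is not a symbolic link")
    os.symlink(target_dir, link_path)
    return os.path.realpath(link_path)

def missing_files(directory, names, extension=".cmp"):
    """Return the base names whose file cannot be found through directory."""
    return [n for n in names
            if not os.path.isfile(os.path.join(directory, n + extension))]
```

For example, `link_directory(".../inter_module/nn_cc_fv_fo_127", ".../inter_module/nn_no_silence_lab_norm_127")` followed by `missing_files(".../inter_module/nn_no_silence_lab_norm_127", ["arctic_b0408"])` should return an empty list if the link works. Passing an absolute path as the target avoids surprises, because a relative symlink target is resolved relative to the directory that contains the link.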
Hi, thank you for the help.
The problem above seems to be solved, but now there is another error, apparently about matrix dimensions. Here is the output:
2018-04-21 09:22:27,154 INFO main : label dimension is 127
2018-04-21 09:22:27,155 INFO main : training DNN
2018-04-21 09:22:27,931 INFO main.train_DNN: building the model
2018-04-21 09:23:24,013 INFO main.train_DNN: fine-tuning the DNN model
2018-04-21 09:24:21,418 INFO main.train_DNN: epoch 1, validation error nan, train error nan time spent 57.40
2018-04-21 09:24:21,418 INFO main.train_DNN: overall training time: 0.96m validation error 179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368.000000
2018-04-21 09:24:21,426 INFO main : generating from DNN
2018-04-21 09:24:21,733 INFO dnn_generation: generating 1 of 132: /home/akfalc/merlin/egs/voice_conversion/s1/experiments/jmk2ksp/acoustic_model/inter_module/nn_no_silence_lab_norm_127/arctic_b0408.cmp
Traceback (most recent call last):
File "/home/akfalc/merlin/src/run_merlin.py", line 1377, in
HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'. HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
I have changed all the dimension-related settings to match the dimensions of my data, but it still gives an error.
From your log, it looks like the model was never actually trained because the learning rate was too high (the train and validation errors are nan), so it cannot generate features. Try lowering the learning rate in your configuration (say, half the value, or even smaller to be safe) and see if that works. For future reference, if you ever get a validation error this large again, it is probably caused by a learning rate that is too high. In most cases the error will be between 100 and 1000.
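A side note on the huge number in the log: it matches the largest finite double-precision value, which suggests it is a starting placeholder rather than a measured error. The sketch below assumes the training loop initialises its best validation loss to `sys.float_info.max` and only replaces it when an epoch does better; that is an assumption about the code, not something shown in this thread. Under that assumption, a nan loss never counts as an improvement, so the placeholder is what gets printed.

```python
import sys

# Largest finite double, formatted the way a "%f" log format would print it.
sentinel = "%f" % sys.float_info.max

def update_best(best, current):
    """Keep the smaller loss; any comparison with nan is False, so nan never wins."""
    return current if current < best else best

best = sys.float_info.max
for epoch_loss in [float("nan"), float("nan")]:
    best = update_best(best, epoch_loss)

# The placeholder survives every nan epoch.
print(best == sys.float_info.max)
print(sentinel[:20] + "...", len(sentinel), "characters")
```

So the number itself carries no information about the model; the `validation error nan` on the epoch line is the real symptom to chase.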
Another suggestion: check that there are no NaN or inf values in your features.
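One way to run that check is sketched below. It assumes the feature files are headerless raw float32 values, which should be verified for your setup, and it uses only the Python 3 standard library. The function names `count_non_finite` and `scan_directory` are hypothetical.

```python
import math
import os
from array import array

def count_non_finite(path):
    """Return (number of values, number of NaN/inf values) for a raw float32 file."""
    values = array("f")
    with open(path, "rb") as fid:
        # Raises ValueError if the file size is not a multiple of 4 bytes.
        values.frombytes(fid.read())
    n_bad = sum(1 for v in values if not math.isfinite(v))
    return len(values), n_bad

def scan_directory(directory, extension=".cmp"):
    """Map each file name containing NaN/inf values to its count of bad values."""
    bad = {}
    for name in sorted(os.listdir(directory)):
        if name.endswith(extension):
            _, n_bad = count_non_finite(os.path.join(directory, name))
            if n_bad:
                bad[name] = n_bad
    return bad
```

Calling `scan_directory` on the directory that holds the .cmp files returns an empty dict when every value is finite. Any file it reports is worth regenerating or excluding before training again.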
Hi, another error appears:
2018-04-24 13:34:37,556 INFO main : label dimension is 127
2018-04-24 13:34:37,556 INFO main : training DNN
2018-04-24 13:34:37,709 INFO main.train_DNN: building the model
2018-04-24 13:34:48,043 INFO main.train_DNN: fine-tuning the DNN model
2018-04-24 13:47:43,274 INFO main.train_DNN: epoch 1, validation error nan, train error nan time spent 775.23
2018-04-24 13:47:43,274 INFO main.train_DNN: overall training time: 12.92m validation error 179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368.000000
2018-04-24 13:47:43,279 INFO main : generating from DNN
2018-04-24 13:47:45,697 INFO dnn_generation: generating 1 of 132: /home/akfalc/merlin/egs/voice_conversion/s1/experiments/jmk2ksp/acoustic_model/inter_module/nn_no_silence_lab_norm_127/arctic_b0408.cmp
Traceback (most recent call last):
File "/home/akfalc/merlin/src/run_merlin.py", line 1378, in
The problem above arose from a model left over from an earlier run. Now that I am using another dataset, the error has changed. Here is the error:
2018-04-24 15:09:25,732 INFO main.train_DNN: epoch 1, validation error nan, train error nan time spent 66.42
2018-04-24 15:09:25,732 INFO main.train_DNN: overall training time: 1.11m validation error 179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368.000000
2018-04-24 15:09:25,739 INFO main : generating from DNN
Traceback (most recent call last):
File "/home/akfalc/merlin/src/run_merlin.py", line 1378, in
At some point it has to create the model, but for some unknown reason it fails to do so. What could be the reason for this? Thanks in advance.
I have now run into yet another error, and I don't know whether it is caused by the config file or by the Python script.
CRITICAL main : train_DNN threw an exception
Traceback (most recent call last):
File "/home/akfalc/merlin/src/run_merlin.py", line 1378, in
main_function(cfg)
File "/home/akfalc/merlin/src/run_merlin.py", line 865, in main_function
cmp_mean_vector = cmp_mean_vector, cmp_std_vector = cmp_std_vector,init_dnn_model_file=cfg.start_from_trained_model)
File "/home/akfalc/merlin/src/run_merlin.py", line 222, in train_DNN
shared_train_set_xy, temp_train_set_x, temp_train_set_y = train_data_reader.load_one_partition()
File "/home/akfalc/merlin/src/utils/providers.py", line 296, in load_one_partition
shared_set_xy, temp_set_x, temp_set_y = self.load_next_partition()
File "/home/akfalc/merlin/src/utils/providers.py", line 751, in load_next_partition
in_features, lab_frame_number = io_fun.load_binary_file_frame(self.x_files_list[self.file_index], self.n_ins)
File "/home/akfalc/merlin/src/io_funcs/binary_io.py", line 64, in load_binary_file_frame
fid_lab = open(file_name, 'rb')
IOError: [Errno 2] No such file or directory: '/home/akfalc/merlin/egs/voice_conversion/s1/experiments/jmk2ksp/acoustic_model/inter_module/nn_no_silence_lab_norm_127/arctic_a0043.cmp'
Lock freed
Thanks to everyone who contributed.