alibaba / Alibaba-MIT-Speech

Alibaba speech technology

How long will it take to train this model in Kaldi without a GPU? #11

Open AlexPeng19 opened 6 years ago

AlexPeng19 commented 6 years ago

"This script is intended to be used with GPUs but you have not compiled Kaldi with CUDA If you want to use GPUs (and have them), go to src/, and configure and make on a machine where "nvcc" is installed."

I see this warning. It may not block the training, but I would like to know how to shorten the training time when there is no GPU. I think my machine is well provisioned: it has 256 GB of memory and 26 processors. Even so, after two weeks of training it has completed only half of the run.sh script. Can anybody help?
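For readers who hit the same warning, a minimal sketch of how one might check whether "nvcc" is visible on the machine before reconfiguring Kaldi. The helper name have_cmd is hypothetical and not part of Kaldi. It only checks the PATH and says nothing about whether a usable GPU is present.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Kaldi): succeed if a command is on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# The warning asks for a machine where "nvcc" is installed, so check for it.
if have_cmd nvcc; then
  echo "nvcc found at: $(command -v nvcc)"
else
  echo "nvcc not found on PATH"
fi
```

If nvcc is not found, the message above suggests that configure and make in src/ would need to be run on a machine that has it.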

tramphero commented 6 years ago

Generally speaking, a single GPU is dozens of times faster than a CPU for this kind of training, so I am afraid it will take months to train this model on CPUs alone.

AlexPeng19 commented 6 years ago

@tramphero Could I ask another question? While I am running librispeech/s5/run.sh, I get the following message:

"This script is intended to be used with GPUs but you have not compiled Kaldi with CUDA If you want to use GPUs (and have them), go to src/, and configure and make on a machine where "nvcc" is installed."

It appears after this command: steps/align_fmllr.sh --nj 30 --cmd "$train_cmd" data/train_clean_100 data/lang exp/tri4b exp/tri4b_ali_clean_100

The script then exits each time without reporting any other error. Does this mean I need to change something somewhere? Looking forward to your answer.

AlexPeng19 commented 6 years ago

@tramphero I see. I checked local/nnet2/run_5a_clean_100.sh and set use_gpu=false, and now it moves on.
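The behaviour described here (stopping while use_gpu is true and Kaldi has no CUDA support, continuing once use_gpu=false) can be illustrated with a minimal sketch. This is not the contents of run_5a_clean_100.sh. The function choose_mode and its cuda_ok argument are hypothetical names used only to show the pattern.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, not the actual Kaldi script.
# choose_mode <use_gpu> <cuda_ok>: print "gpu" or "cpu"; fail if GPU training
# is requested but CUDA support is not available.
choose_mode() {
  local use_gpu="$1" cuda_ok="$2"
  if [ "$use_gpu" = true ]; then
    if [ "$cuda_ok" != true ]; then
      echo "use_gpu=true but CUDA support is not available" >&2
      return 1
    fi
    echo gpu
  else
    echo cpu
  fi
}

# With use_gpu=false the check is skipped and the CPU path is selected.
use_gpu=false
mode=$(choose_mode "$use_gpu" false) && echo "training mode: $mode"
```

With use_gpu=true and no CUDA support, the function returns a non-zero status, which matches a script that prints the message and stops instead of continuing on the CPU.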

AlexPeng19 commented 6 years ago

I used the GridEngine parallelism configuration with 24 threads. Is there any way to shorten the training time? I intended to run on multiple nodes, but that failed with an error about finding the master node path, so I had to disable the other nodes. Have you ever come across this kind of problem?

AlexPeng19 commented 6 years ago

@tramphero