Closed: @oplatek closed this issue 4 years ago.
Hi @oplatek,
I built this image only with the CUDA backend. To build it with the CPU backend and reproduce everything from the paper (except training the language models with fairseq), modify the Dockerfile https://github.com/facebookresearch/wav2letter/blob/master/recipes/models/lexicon_free/Dockerfile as follows:
```dockerfile
FROM wav2letter/wav2letter:cpu-base-26c69be

# ==================================================================
# flashlight https://github.com/facebookresearch/flashlight.git
# ------------------------------------------------------------------
RUN cd /root && git clone --recursive https://github.com/facebookresearch/flashlight.git && \
    cd /root/flashlight && git checkout da99018f393c9301c9bb50908dabde954b290256 && \
    git submodule update --init --recursive && mkdir -p build && \
    export MKLROOT=/opt/intel/mkl && \
    cd build && cmake .. -DCMAKE_BUILD_TYPE=Release -DFLASHLIGHT_BACKEND=CPU && \
    make -j8 && make install && \
# ==================================================================
# kenlm rebuild with max order 20 and install python wrapper
# ------------------------------------------------------------------
    cd /root/kenlm/build && \
    cmake .. -DKENLM_MAX_ORDER=20 && make -j8 && make install && \
    cd /root/kenlm && \
    sed -i 's/DKENLM_MAX_ORDER=6/DKENLM_MAX_ORDER=20/g' setup.py && \
    pip install . && \
# ==================================================================
# wav2letter with CPU backend
# ------------------------------------------------------------------
    cd /root && git clone --recursive https://github.com/facebookresearch/wav2letter.git && \
    export KENLM_ROOT_DIR=/root/kenlm && \
    cd /root/wav2letter && git checkout 9bf4538 && mkdir -p build && cd build && \
    cmake .. -DCMAKE_BUILD_TYPE=Release -DW2L_LIBRARIES_USE_CUDA=OFF -DKENLM_MAX_ORDER=20 && \
    make -j8 && \
# ==================================================================
# sph2pipe
# ------------------------------------------------------------------
    cd /root && wget https://www.ldc.upenn.edu/sites/www.ldc.upenn.edu/files/ctools/sph2pipe_v2.5.tar.gz && \
    tar -xzf sph2pipe_v2.5.tar.gz && cd sph2pipe_v2.5 && \
    gcc -o sph2pipe *.c -lm
```
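The KenLM wrapper step above depends on a one-line `sed` rewrite of `setup.py` to raise the maximum n-gram order. As a quick sanity check of that substitution outside the image (the file content below is a hypothetical stand-in for the relevant line of kenlm's `setup.py`, not the real file):

```shell
# Stand-in for the compile-args line in kenlm's setup.py (hypothetical content)
printf "%s\n" "ARGS = ['-O3', '-DNDEBUG', '-DKENLM_MAX_ORDER=6']" > /tmp/setup_snippet.py

# The same substitution the Dockerfile runs, raising the max order from 6 to 20
sed -i 's/DKENLM_MAX_ORDER=6/DKENLM_MAX_ORDER=20/g' /tmp/setup_snippet.py

cat /tmp/setup_snippet.py
```

With the Dockerfile saved as above, the image can then be built the usual way, e.g. `docker build -t wav2letter/wav2letter:lexfree-cpu .` (the tag here is just an example).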
Alternatively, you can use the regular CPU docker image, but please check out flashlight at da99018f393c9301c9bb50908dabde954b290256 and wav2letter at 9bf4538, and build with -DKENLM_MAX_ORDER=20 (the Dockerfile above shows how this can be done).
Let me know if this works for you!
Hi @tlikhomanenko,
thank you for your help!
The docker image helped me launch the decoding, but I still cannot decode on CPU with convLM. With the config files and data prepared as in the README, I ran this command:
```shell
docker run --mount src=$(pwd)/model,target=/root/model,type=bind \
    --mount src=$(pwd)/data,target=/root/data,type=bind \
    --rm --ipc=host --name lexfree-decoding wav2letter/wav2letter:lexfree-cpu \
    /root/wav2letter/build/Decoder \
        --flagsfile /root/data/${USER}-decoder_char_convlm_lexfree.cfg \
        --minloglevel=0 \
        --logtostderr=1
```
which results in the following error:
```
...
...
I1213 10:43:40.525741 1 Decode.cpp:92] [Criterion] AutoSegmentationCriterion
I1213 10:43:40.525761 1 Decode.cpp:94] [Network] Number of params: 10116412
I1213 10:43:40.525770 1 Decode.cpp:100] [Network] Updating flags from config file: /root/model/am/baseline_nov93dev.bin
I1213 10:43:40.526813 1 Decode.cpp:112] Gflags after parsing
--flagfile=; --fromenv=; --tryfromenv=; --undefok=; --tab_completion_columns=80; --tab_completion_word=; --help=false; --helpfull=false; --helpmatch=; --helpon=; --helppackage=false; --helpshort=false; --helpxml=false; --version=false; --adambeta1=0.90000000000000002; --adambeta2=0.999; --am=/root/model/am/baseline_nov93dev.bin; --arch=vital-half-kwinc-kwb13-kwe21_cpp_wn-s2; --archdir=/mnt/vol/gfsai-east/ai-group/users/locronan/wsj++/arch; --attention=content; --attentionthreshold=0; --attnWindow=no; --attnconvchannel=0; --attnconvkernel=0; --attndim=0; --batchsize=16; --beamsize=500; --beamthreshold=25; --blobdata=false; --channels=1; --criterion=asg; --critoptim=sgd; --datadir=/root/data/lists/; --dataorder=input; --decoderattnround=1; --decoderdropout=0; --decoderrnnlayer=1; --decodertype=tkn; --devwin=0; --emission_dir=; --enable_distributed=false; --encoderdim=0; --eostoken=false; --everstoredb=false; --fftcachesize=1; --filterbanks=80; --flagsfile=/root/data/oplatek-decoder_char_convlm_lexfree.cfg; --framesizems=25; --framestridems=10; --gamma=1; --gumbeltemperature=1; --hardselection=1; --input=flac; --inputbinsize=100; --inputfeeding=false; --iter=1000000; --itersave=false; --labelsmooth=0; --leftWindowSize=50; --lexicon=/root/model/decoder/lexicon.lst; --linlr=-1; --linlrcrit=-1; --linseg=1; --lm=/root/model/decoder/convlm_models/lm_wsj_convlm_char_20B.bin; --lm_memory=3000; --lm_vocab=/root/model/decoder/convlm_models/lm_wsj_convlm_char_20B.vocab; --lmtype=convlm; --lmweight=1.7510731428777175; --localnrmlleftctx=0; --localnrmlrightctx=0; --logadd=false; --lr=5.5999999999999996; --lrcrit=0.0080000000000000002; --maxdecoderoutputlen=200; --maxgradnorm=0.050000000000000003; --maxisz=9223372036854775807; --maxload=-1; --maxrate=10; --maxsil=50; --maxtsz=9223372036854775807; --maxword=-1; --melfloor=1; --memstepsize=10485760; --mfcc=false; --mfcccoeffs=13; --mfsc=true; --minisz=0; --minrate=3; --minsil=0; --mintsz=0; --momentum=0; --netoptim=sgd;
--noresample=false; --nthread=6; --nthread_decoder=2; --onorm=target; --optimepsilon=1e-08; --optimrho=0.90000000000000002; --outputbinsize=5; --pctteacherforcing=100; --pcttraineval=100; --pow=false; --pretrainWindow=0; --replabel=2; --reportiters=0; --rightWindowSize=50; --rndv_filepath=; --rundir=/mnt/vol/gfsai-east/ai-group/users/locronan/wsj++/runs/baseline-variants/chronos; --runname=baseline_lr5.6_lrcrit0.008_fb80_bsz16_archs2; --samplerate=16000; --sampletarget=0; --samplingstrategy=rand; --sclite=/root/data/sclite; --seed=0; --show=true; --showletters=true; --silweight=-2.3696139062587926; --smearing=max; --smoothingtemperature=1; --softselection=inf; --softwoffset=10; --softwrate=5; --softwstd=5; --sqnorm=true; --stepsize=1000000; --surround=|; --tag=; --target=ltr; --test=nov92.lst; --tokens=tokens.lst; --tokensdir=/root/model/am; --train=si284; --trainWithWindow=false; --transdiag=5; --unkweight=-inf; --uselexicon=false; --usewordpiece=false; --valid=nov93dev,nov92; --weightdecay=0; --wordscore=2.9346358245918216; --wordseparator=|; --world_rank=0; --world_size=1; --alsologtoemail=; --alsologtostderr=false; --colorlogtostderr=false; --drop_log_memory=true; --log_backtrace_at=; --log_dir=; --log_link=; --log_prefix=true; --logbuflevel=0; --logbufsecs=30; --logemaillevel=999; --logmailer=/bin/mail; --logtostderr=true; --max_log_size=1800; --minloglevel=0; --stderrthreshold=5; --stop_logging_if_full_disk=false; --symbolize_stacktrace=true; --v=0; --vmodule=;
I1213 10:43:40.526911 1 Decode.cpp:133] Number of classes (network): 31
I1213 10:43:40.898849 1 Decode.cpp:140] Number of words: 162533
Falling back to using letters as targets for the unknown word: martirosov
I1213 10:43:41.133837 1 W2lListFilesDataset.cpp:137] 333 files found.
I1213 10:43:41.133868 1 Utils.cpp:102] Filtered 0/333 samples
I1213 10:43:41.133918 1 W2lListFilesDataset.cpp:62] Total batches (i.e. iters): 333
I1213 10:43:41.133985 1 Decode.cpp:154] [Serialization] Running forward pass ...
Falling back to using letters as targets for the unknown word: martirosov
Skipping unknown entry: 'martirosov'
I1213 10:44:26.121348 1 Decode.cpp:201] [Dataset] Number of samples per thread: 167
I1213 10:44:26.205636 1 Decode.cpp:292] [ConvLM]: Loading LM from /root/model/decoder/convlm_models/lm_wsj_convlm_char_20B.bin
[ConvLM]: Loading vocabulary from /root/model/decoder/convlm_models/lm_wsj_convlm_char_20B.vocab
[ConvLM]: vocabulary size of convLM 40
I1213 10:44:37.671041 1 Decode.cpp:308] [Decoder] LM constructed.
F1213 10:44:37.671173 46 Decode.cpp:357] FLAGS_nthread_decoder exceeds the number of visible GPUs
*** Check failure stack trace: ***
I1213 10:44:37.671195 47 Decode.cpp:430] [Decoder] Lexicon-free decoder with token-LM loaded in thread: 0
    @ 0x7f13fc3285cd google::LogMessage::Fail()
    @ 0x7f13fc32a433 google::LogMessage::SendToLog()
    @ 0x7f13fc32815b google::LogMessage::Flush()
    @ 0x7f13fc32ae1e google::LogMessageFatal::~LogMessageFatal()
    @ 0x47dbac _ZZ4mainENKUliiiE2_clEiii
    @ 0x47e9aa _ZNSt17_Function_handlerIFSt10unique_ptrINSt13__future_base12_Result_baseENS2_8_DeleterEEvENS1_12_Task_setterIS0_INS1_7_ResultIvEES3_ESt12_Bind_simpleIFSt17reference_wrapperISt5_BindIFZ4mainEUliiiE2_iiiEEEvEEvEEE9_M_invokeERKSt9_Any_data
    @ 0x483f69 std::__future_base::_State_baseV2::_M_do_set()
    @ 0x7f13fc55aa99 __pthread_once_slow
    @ 0x47af31 _ZNSt13__future_base11_Task_stateISt5_BindIFZ4mainEUliiiE2_iiiEESaIiEFvvEE6_M_runEv
    @ 0x488c8b _ZNSt6thread5_ImplISt12_Bind_simpleIFZN2fl10ThreadPoolC4EmRKSt8functionIFvmEEEUlvE_vEEE6_M_runEv
    @ 0x7f13fc053c80 (unknown)
    @ 0x7f13fc5536ba start_thread
    @ 0x7f13fb7b941d clone
    @ (nil) (unknown)
*** Aborted at 1576233877 (unix time) try "date -d @1576233877" if you are using GNU date ***
PC: @ 0x7f13fb6e9196 abort
*** SIGSEGV (@0x0) received by PID 1 (TID 0x7f13cd3fa700) from PID 0; stack trace: ***
    @ 0x7f13fc55d390 (unknown)
    @ 0x7f13fb6e9196 abort
    @ 0x7f13fc33112c (unknown)
    @ 0x7f13fc3285cd google::LogMessage::Fail()
    @ 0x7f13fc32a433 google::LogMessage::SendToLog()
    @ 0x7f13fc32815b google::LogMessage::Flush()
    @ 0x7f13fc32ae1e google::LogMessageFatal::~LogMessageFatal()
    @ 0x47dbac _ZZ4mainENKUliiiE2_clEiii
    @ 0x47e9aa _ZNSt17_Function_handlerIFSt10unique_ptrINSt13__future_base12_Result_baseENS2_8_DeleterEEvENS1_12_Task_setterIS0_INS1_7_ResultIvEES3_ESt12_Bind_simpleIFSt17reference_wrapperISt5_BindIFZ4mainEUliiiE2_iiiEEEvEEvEEE9_M_invokeERKSt9_Any_data
    @ 0x483f69 std::__future_base::_State_baseV2::_M_do_set()
    @ 0x7f13fc55aa99 __pthread_once_slow
    @ 0x47af31 _ZNSt13__future_base11_Task_stateISt5_BindIFZ4mainEUliiiE2_iiiEESaIiEFvvEE6_M_runEv
    @ 0x488c8b _ZNSt6thread5_ImplISt12_Bind_simpleIFZN2fl10ThreadPoolC4EmRKSt8functionIFvmEEEUlvE_vEEE6_M_runEv
    @ 0x7f13fc053c80 (unknown)
    @ 0x7f13fc5536ba start_thread
    @ 0x7f13fb7b941d clone
    @ 0x0 (unknown)
```
Based on the error it seems that decoding with convLM assumes a GPU, see https://github.com/facebookresearch/wav2letter/blob/master/Decode.cpp#L357

Is the convLM code "heavily" GPU-dependent, or would it just be too slow on CPU? (I would not mind a huge slowdown right now, or editing a few lines of C++ code.)
PS: If I use decoder_char_15gram_lexfree.cfg or decoder_char_20gram_lexfree.cfg, the decoding runs fine, i.e. KenLM works when it is used instead of convLM.
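For reference, the convLM-specific flags in my failing config (as echoed back in the flags dump above) are:

```
--lmtype=convlm
--lm=/root/model/decoder/convlm_models/lm_wsj_convlm_char_20B.bin
--lm_vocab=/root/model/decoder/convlm_models/lm_wsj_convlm_char_20B.vocab
```

The n-gram configs differ only in these LM flags, so the crash is isolated to the convLM path.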
@oplatek,
Right now we have tested convLM only on GPU, and in the current implementation ConvLM runs on the GPU only. You can try to remove the check here https://github.com/facebookresearch/wav2letter/blob/master/Decode.cpp#L362 (it guards the GPU-only path, which is the one we rely on). Let me know if this works for you.
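The fatal check simply compares `--nthread_decoder` against the number of visible GPUs, which is zero in a plain (non-nvidia) docker container. A tiny shell illustration of why the run aborts (using `nvidia-smi -L` as my stand-in for the CUDA device query in Decode.cpp; the variable names are mine):

```shell
# --nthread_decoder=2 comes from the flags dump in the log above
nthread_decoder=2

# A plain docker container exposes no GPUs; nvidia-smi -L lists one line
# per GPU, so on a GPU-less box (or without nvidia-docker) this is 0.
visible_gpus=$(nvidia-smi -L 2>/dev/null | wc -l)

# Mirror of the check that currently aborts with LOG(FATAL)
if [ "$nthread_decoder" -gt "$visible_gpus" ]; then
  echo "FLAGS_nthread_decoder exceeds the number of visible GPUs"
fi
```

Removing the check means those decoder threads fall through to whatever backend the binary was built with, which is why it is worth trying with the CPU build above.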
I am closing the issue for now. Feel free to reopen if it is needed.
What went well: I used the pre-trained models and the docker image wav2letter/wav2letter:lexfree. I successfully launched the decoding past the point where it loads the models, but it seems the wav2letter/wav2letter:lexfree image assumes nvidia-docker, right?

What went wrong: I used regular docker version 18.09.4 and omitted --runtime=cuda from the command. It resulted in an error.

How can I decode with the lexfree model using regular Docker on CPU? Do I need to change the Dockerfile to build a CPU version of it? If yes, can you hint at what needs to change in https://github.com/facebookresearch/wav2letter/blob/master/recipes/models/lexicon_free/Dockerfile ? I assume it is the Dockerfile used to build the wav2letter/wav2letter:lexfree image.