flashlight / wav2letter

Facebook AI Research's Automatic Speech Recognition Toolkit
https://github.com/facebookresearch/wav2letter/wiki

Getting a coredump while running the decoder with the language model. #948

Closed vchagari closed 3 years ago

vchagari commented 3 years ago

Bug Description

Getting a coredump while running the decoder with the language model.

Note: if I don't provide the language model to the decoder, it runs fine.
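Since the crash appears only once the LM is in play, a quick way to isolate it is to score a sentence with KenLM's standalone query tool, outside of wav2letter entirely. A minimal sketch, assuming a standard KenLM checkout built with its own CMake; the query path below is illustrative, not from this report:

```sh
# Hypothetical sanity check: if this also crashes, the LM binary itself is
# suspect; if it prints per-word log probabilities, the fault is more likely
# in the decoder build.
echo "hello world" | /path/to/kenlm/build/bin/query /data/lm_04.bin
```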

Commit IDs:

- wav2letter: b1d1f89f586120a978a4666cffd45c55f0a2e564 (HEAD -> v0.2, origin/v0.2)
- KenLM: e47088ddfae810a5ee4c8a9923b5f8071bed1ae8 (HEAD)
- flashlight: e62eb7ea4c9381411508c08226598ba11cbf9511 (HEAD -> v0.2, origin/v0.2)
- ArrayFire: d9d9b6584029e0b480875cdcf35f3238d43ac0e0 (HEAD, tag: v3.7.1)

Steps/Details:

Command:

```
./Decoder --flagsfile /data/scripts/decode.cfg --lm /data/lm_04.bin --lmweight=0.5515838301157 --wordscore=0.52526055643809
```

Log output:

```
I0209 12:34:58.996706 30329 Decode.cpp:106] Gflags after parsing --flagfile=; --fromenv=; --tryfromenv=; --undefok=; --tab_completion_columns=80; --tab_completion_word=; --help=false; --helpfull=false; --helpmatch=; --helpon=; --helppackage=false; --helpshort=false; --helpxml=false; --version=false; --adambeta1=0.90000000000000002; --adambeta2=0.999; --am=/data/02_04_2021/inference_2019/results/001_model_iter_091.bin; --am_decoder_tr_dropout=0; --am_decoder_tr_layerdrop=0; --am_decoder_tr_layers=1; --arch=/data/am_500ms_future_context.arch; --archdir=; --attention=content; --attentionthreshold=0; --attnWindow=no; --attnconvchannel=0; --attnconvkernel=0; --attndim=0; --batchsize=8; --beamsize=500; --beamsizetoken=100; --beamthreshold=100; --blobdata=false; --channels=1; --criterion=ctc; --critoptim=sgd; --datadir=/data/lists/; --dataorder=input; --decoderattnround=1; --decoderdropout=0; --decoderrnnlayer=1; --decodertype=wrd; --devwin=0; --emission_dir=; --emission_queue_size=3000; --enable_distributed=true; --encoderdim=0; --eosscore=0; --eostoken=false; --everstoredb=false; --fftcachesize=1; --filterbanks=80; --flagsfile=/data/scripts/decode.cfg; --framesizems=25; --framestridems=10; --gamma=1; --gumbeltemperature=1; --input=wav; --inputbinsize=100; --inputfeeding=false; --isbeamdump=false; --iter=100000000; --itersave=true; --labelsmooth=0; --leftWindowSize=50; --lexicon=/data/decoder-unigram-10000-nbest10.lexicon; --linlr=-1; --linlrcrit=-1; --linseg=0; --lm=/data/lm_04.bin; --lm_memory=5000; --lm_vocab=; --lmtype=kenlm; --lmweight=0.5515838301157; --localnrmlleftctx=300; --localnrmlrightctx=0; --logadd=false; --lr=0.01; --lr_decay=10000; --lr_decay_step=9223372036854775807; --lrcosine=false; --lrcrit=0; --max_devices_per_node=8; --maxdecoderoutputlen=200; --maxgradnorm=0.5; --maxisz=33000; --maxload=-1; --maxrate=10; --maxsil=50; --maxtsz=9223372036854775807; --maxword=-1; --melfloor=1; --memstepsize=10485760; --mfcc=false; --mfcccoeffs=13; --mfsc=true; --minisz=200; --minrate=3; --minsil=0; --mintsz=2; --momentum=0.80000000000000004; --netoptim=sgd; --noresample=false; --nthread=6; --nthread_decoder=8; --nthread_decoder_am_forward=1; --numattnhead=8; --onorm=target; --optimepsilon=1e-08; --optimrho=0.90000000000000002; --outputbinsize=5; --pctteacherforcing=100; --pcttraineval=1; --pow=false; --pretrainWindow=0; --replabel=0; --reportiters=1000; --rightWindowSize=50; --rndv_filepath=; --rundir=/data/02_04_2021; --runname=inference_2019; --samplerate=16000; --sampletarget=0; --samplingstrategy=rand; --saug_fmaskf=27; --saug_fmaskn=2; --saug_start_update=-1; --saug_tmaskn=2; --saug_tmaskp=1; --saug_tmaskt=100; --sclite=; --seed=0; --show=false; --showletters=false; --silscore=0.5; --smearing=max; --smoothingtemperature=1; --softwoffset=10; --softwrate=5; --softwstd=5; --sqnorm=true; --stepsize=1000000; --surround=; --tag=; --target=tkn; --test=test.lst; --tokens=librispeech-train-all-unigram-10000.tokens; --tokensdir=/data/; --train=lists/train.lst; --trainWithWindow=false; --transdiag=0; --unkscore=-inf; --usememcache=false; --uselexicon=true; --usewordpiece=true; --valid=lists/dev.lst; --validbatchsize=-1; --warmup=1; --weightdecay=0; --wordscore=0.52526055643809; --wordseparator=; --world_rank=0; --world_size=32; --alsologtoemail=; --alsologtostderr=false; --colorlogtostderr=false; --drop_log_memory=true; --log_backtrace_at=; --log_dir=; --log_link=; --log_prefix=true; --logbuflevel=0; --logbufsecs=30; --logemaillevel=999; --logfile_mode=436; --logmailer=/bin/mail; --logtostderr=true; --max_log_size=1800; --minloglevel=0; --stderrthreshold=2; --stop_logging_if_full_disk=false; --symbolize_stacktrace=true; --v=0; --vmodule=;
I0209 12:34:58.999300 30329 Decode.cpp:127] Number of classes (network): 9998
I0209 12:34:59.957891 30329 Decode.cpp:134] Number of words: 204170
I0209 12:35:00.082792 30329 Decode.cpp:247] [Decoder] LM constructed.
```

Crash:

```
Aborted at 1612902900 (unix time) try "date -d @1612902900" if you are using GNU date
PC: @ 0x5605cdad1571 lm::ngram::detail::GenericModel<>::ScoreExceptBackoff()
SIGSEGV (@0x5605f6fdc000) received by PID 30329 (TID 0x7f30abdfc000) from PID 18446744073558409216; stack trace:
    @ 0x7f30a4147980 (unknown)
    @ 0x5605cdad1571 lm::ngram::detail::GenericModel<>::ScoreExceptBackoff()
    @ 0x5605cdad1775 lm::ngram::detail::GenericModel<>::FullScore()
    @ 0x5605cdad184a lm::base::ModelFacade<>::BaseScore()
    @ 0x5605cd9f4ed5 w2l::KenLM::score()
    @ 0x5605cd89e626 main
    @ 0x7f3061496bf7 __libc_start_main
    @ 0x5605cd90132a _start
Segmentation fault (core dumped)
```
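The glog trace above points at KenLM's scoring path (w2l::KenLM::score -> lm::base::ModelFacade<>::BaseScore). To dig further, the dumped core can be opened in gdb; a sketch, assuming core dumps are enabled and the core file lands in the working directory (the filename depends on kernel.core_pattern):

```sh
ulimit -c unlimited    # allow core files before reproducing the crash
gdb ./Decoder core     # load the binary together with the core file
# inside gdb, `bt full` prints every frame with its local variables,
# showing the state KenLM was handed when it dereferenced bad memory
```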

Platform and Hardware

- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.04 LTS
- Python version: 3.6.9
- Bazel version (if compiling from source): N/A
- GCC/Compiler version (if compiling from source): N/A
- CUDA/cuDNN version: 10.1 / 7.6.4.38
- GPU model and memory: NVIDIA-SMI 460.27.04, Driver Version 460.27.04

vchagari commented 3 years ago

Recompiling wav2letter, flashlight, and ArrayFire solved the issue!

archanaqre commented 9 months ago

@vchagari Can you give me an idea of the commands used for recompiling?
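The thread never got the exact commands, but a clean rebuild in dependency order (ArrayFire first, then flashlight, then wav2letter) is the usual shape of the fix, since object code built against mismatched ArrayFire or flashlight headers can segfault exactly like this. Below is a sketch for the v0.2-era CUDA stack; the CMake flag names are assumptions, not taken from this thread:

```sh
# Rebuild flashlight against the installed ArrayFire, then wav2letter
# against the freshly installed flashlight. Flag names vary by version
# (assumed here); check each project's CMakeLists.txt for your checkout.
cd flashlight && rm -rf build && mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release -DFLASHLIGHT_BACKEND=CUDA
make -j"$(nproc)" && sudo make install

cd ../../wav2letter && rm -rf build && mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release -DW2L_LIBRARIES_USE_CUDA=ON
make -j"$(nproc)"
```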