flashlight / wav2letter

Facebook AI Research's Automatic Speech Recognition Toolkit
https://github.com/facebookresearch/wav2letter/wiki

model conversion #610

Closed mohamad-hasan-sohan-ajini closed 4 years ago

mohamad-hasan-sohan-ajini commented 4 years ago

Hi, I trained a character-level model with the same arch file as streaming TDS. I tried to convert it to the inference format with:

./build/tools/streaming_tds_model_converter -am /data/tmp/mzl_streaming/003_model_lists#dev_norm_agg.csv.bin --outdir /data/tmp/convert

and got the following core dump:

I0414 12:00:41.092471 40 StreamingTDSModelConverter.cpp:164] Gflags after parsing --flagfile=; --fromenv=; --tryfromenv=; --undefok=; --tab_completion_columns=80; --tab_completion_word=; --help=false; --helpfull=false; --helpmatch=; --helpon=; --helppackage=false; --helpshort=false; --helpxml=false; --version=false; --adambeta1=0.90000000000000002; --adambeta2=0.999; --am=/data/tmp/mzl_streaming/003_model_lists#dev_norm_agg.csv.bin; --am_decoder_tr_dropout=0; --am_decoder_tr_layerdrop=0; --am_decoder_tr_layers=1; --arch=am_500ms_future_context.arch; --archdir=/root/wav2letter/tutorials/mzl_streaming; --attention=content; --attentionthreshold=2147483647; --attnWindow=no; --attnconvchannel=0; --attnconvkernel=0; --attndim=0; --batchsize=8; --beamsize=2500; --beamsizetoken=250000; --beamthreshold=25; --blobdata=false; --channels=1; --criterion=ctc; --critoptim=sgd; --datadir=/data/mozilla/agg; --dataorder=input; --decoderattnround=1; --decoderdropout=0; --decoderrnnlayer=1; --decodertype=wrd; --devwin=0; --emission_dir=; --emission_queue_size=3000; --enable_distributed=false; --encoderdim=0; --eosscore=0; --eostoken=false; --everstoredb=false; --fftcachesize=1; --filterbanks=80; --flagsfile=tutorials/mzl_streaming/train_am_500ms_future_context.cfg; --framesizems=25; --framestridems=10; --gamma=1; --gumbeltemperature=1; --input=flac; --inputbinsize=100; --inputfeeding=false; --isbeamdump=false; --iter=1000000; --itersave=false; --labelsmooth=0; --leftWindowSize=50; --lexicon=/data/mozilla/agg/am/lexicon_agg.txt; --linlr=-1; --linlrcrit=-1; --linseg=0; --lm=; --lm_memory=5000; --lm_vocab=; --lmtype=kenlm; --lmweight=0; --localnrmlleftctx=300; --localnrmlrightctx=0; --logadd=false; --lr=0.29999999999999999; --lr_decay=9223372036854775807; --lr_decay_step=9223372036854775807; --lrcosine=false; --lrcrit=0; --maxdecoderoutputlen=200; --maxgradnorm=0.5; --maxisz=9223372036854775807; --maxload=-1; --maxrate=10; --maxsil=50; --maxtsz=9223372036854775807; --maxword=-1; 
--melfloor=1; --memstepsize=10485760; --mfcc=false; --mfcccoeffs=13; --mfsc=true; --minisz=0; --minrate=3; --minsil=0; --mintsz=0; --momentum=0; --netoptim=sgd; --noresample=false; --nthread=10; --nthread_decoder=1; --nthread_decoder_am_forward=1; --numattnhead=8; --onorm=target; --optimepsilon=1e-08; --optimrho=0.90000000000000002; --outputbinsize=5; --pctteacherforcing=100; --pcttraineval=1; --pow=false; --pretrainWindow=0; --replabel=0; --reportiters=1000; --rightWindowSize=50; --rndv_filepath=; --rundir=/root/wav2letter/runs; --runname=mzl_streaming; --samplerate=16000; --sampletarget=0; --samplingstrategy=rand; --saug_fmaskf=27; --saug_fmaskn=2; --saug_start_update=-1; --saug_tmaskn=2; --saug_tmaskp=1; --saug_tmaskt=100; --sclite=; --seed=0; --show=false; --showletters=false; --silscore=0; --smearing=none; --smoothingtemperature=1; --softwoffset=10; --softwrate=5; --softwstd=5; --sqnorm=true; --stepsize=1000000; --surround=|; --tag=; --target=tkn; --test=; --tokens=/data/mozilla/agg/am/tokens.txt; --tokensdir=; --train=lists/train_norm_agg.csv; --trainWithWindow=false; --transdiag=0; --unkscore=-inf; --use_memcache=false; --use_saug=false; --uselexicon=true; --usewordpiece=false; --valid=lists/dev_norm_agg.csv; --warmup=8000; --weightdecay=0; --wordscore=0; --wordseparator=|; --world_rank=0; --world_size=1; --outdir=/data/tmp/convert; --alsologtoemail=; --alsologtostderr=false; --colorlogtostderr=false; --drop_log_memory=true; --log_backtrace_at=; --log_dir=; --log_link=; --log_prefix=true; --logbuflevel=0; --logbufsecs=30; --logemaillevel=999; --logmailer=/bin/mail; --logtostderr=true; --max_log_size=1800; --minloglevel=0; --stderrthreshold=2; --stop_logging_if_full_disk=false; --symbolize_stacktrace=true; --v=0; --vmodule=; I0414 12:00:41.092607 40 StreamingTDSModelConverter.cpp:179] Number of classes (network): 40 Skipping View module: V -1 NFEAT 1 0 Skipping SpecAugment module: SAUG 80 27 2 100 1.0 2 Skipping Dropout module: DO 0.1 Skipping Dropout module: 
DO 0.1
Skipping Dropout module: DO 0.1
Skipping Dropout module: DO 0.1
Skipping Reorder module: RO 2 1 0 3
Skipping View module: V 2160 -1 1 0
Skipping View module: V NLABEL 0 -1 1
Aborted at 1586865644 (unix time) try "date -d @1586865644" if you are using GNU date
PC: @ 0x419e6b main
SIGSEGV (@0x0) received by PID 40 (TID 0x7f9457044600) from PID 0; stack trace:
@ 0x7f9414e22390 (unknown)
@ 0x419e6b main
@ 0x7f940d1bc830 __libc_start_main
@ 0x496019 _start
@ 0x0 (unknown)
Segmentation fault (core dumped)

Any suggestion is appreciated.

vineelpratap commented 4 years ago

Hi, could you also provide a gdb backtrace?

mohamad-hasan-sohan-ajini commented 4 years ago

Here is the backtrace output:

root@2a0c408a2a35:~/wav2letter# gdb ./build/tools/streaming_tds_model_converter
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.5) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see: http://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at: http://www.gnu.org/software/gdb/documentation/.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./build/tools/streaming_tds_model_converter...(no debugging symbols found)...done.
(gdb) run -am /data/tmp/mzl_streaming/003_model_lists#dev_norm_agg.csv.bin --outdir /data/tmp/convert
Starting program: /root/wav2letter/build/tools/streaming_tds_model_converter -am /data/tmp/mzl_streaming/003_model_lists#dev_norm_agg.csv.bin --outdir /data/tmp/convert
warning: Error disabling address space randomization: Operation not permitted
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7fb8dfdec700 (LWP 388)] [New Thread 0x7fb8df5eb700 (LWP 389)] [New Thread 0x7fb8ded69700 (LWP 390)] I0415 06:13:21.690006 384 StreamingTDSModelConverter.cpp:164] Gflags after parsing --flagfile=; --fromenv=; --tryfromenv=; --undefok=; --tab_completion_columns=80; --tab_completion_word=; --help=false; --helpfull=false; --helpmatch=; --helpon=; --helppackage=false; --helpshort=false; --helpxml=false; --version=false; --adambeta1=0.90000000000000002; --adambeta2=0.999; --am=/data/tmp/mzl_streaming/003_model_lists#dev_norm_agg.csv.bin; --am_decoder_tr_dropout=0; --am_decoder_tr_layerdrop=0; --am_decoder_tr_layers=1; --arch=am_500ms_future_context.arch; --archdir=/root/wav2letter/tutorials/mzl_streaming; --attention=content; --attentionthreshold=2147483647; --attnWindow=no; --attnconvchannel=0; --attnconvkernel=0; --attndim=0; --batchsize=8; --beamsize=2500; --beamsizetoken=250000; --beamthreshold=25; --blobdata=false; --channels=1; --criterion=ctc; --critoptim=sgd; --datadir=/data/mozilla/agg; --dataorder=input; --decoderattnround=1; --decoderdropout=0; --decoderrnnlayer=1; --decodertype=wrd; --devwin=0; --emission_dir=; --emission_queue_size=3000; --enable_distributed=false; --encoderdim=0; --eosscore=0; --eostoken=false; --everstoredb=false; --fftcachesize=1; --filterbanks=80; --flagsfile=tutorials/mzl_streaming/train_am_500ms_future_context.cfg; --framesizems=25; --framestridems=10; --gamma=1; --gumbeltemperature=1; --input=flac; --inputbinsize=100; --inputfeeding=false; --isbeamdump=false; --iter=1000000; --itersave=false; --labelsmooth=0; --leftWindowSize=50; --lexicon=/data/mozilla/agg/am/lexicon_agg.txt; --linlr=-1; --linlrcrit=-1; --linseg=0; --lm=; --lm_memory=5000; --lm_vocab=; --lmtype=kenlm; --lmweight=0; --localnrmlleftctx=300; --localnrmlrightctx=0; --logadd=false; --lr=0.29999999999999999; --lr_decay=9223372036854775807; --lr_decay_step=9223372036854775807; --lrcosine=false; --lrcrit=0; --maxdecoderoutputlen=200; --maxgradnorm=0.5; 
--maxisz=9223372036854775807; --maxload=-1; --maxrate=10; --maxsil=50; --maxtsz=9223372036854775807; --maxword=-1; --melfloor=1; --memstepsize=10485760; --mfcc=false; --mfcccoeffs=13; --mfsc=true; --minisz=0; --minrate=3; --minsil=0; --mintsz=0; --momentum=0; --netoptim=sgd; --noresample=false; --nthread=10; --nthread_decoder=1; --nthread_decoder_am_forward=1; --numattnhead=8; --onorm=target; --optimepsilon=1e-08; --optimrho=0.90000000000000002; --outputbinsize=5; --pctteacherforcing=100; --pcttraineval=1; --pow=false; --pretrainWindow=0; --replabel=0; --reportiters=1000; --rightWindowSize=50; --rndv_filepath=; --rundir=/root/wav2letter/runs; --runname=mzl_streaming; --samplerate=16000; --sampletarget=0; --samplingstrategy=rand; --saug_fmaskf=27; --saug_fmaskn=2; --saug_start_update=-1; --saug_tmaskn=2; --saug_tmaskp=1; --saug_tmaskt=100; --sclite=; --seed=0; --show=false; --showletters=false; --silscore=0; --smearing=none; --smoothingtemperature=1; --softwoffset=10; --softwrate=5; --softwstd=5; --sqnorm=true; --stepsize=1000000; --surround=|; --tag=; --target=tkn; --test=; --tokens=/data/mozilla/agg/am/tokens.txt; --tokensdir=; --train=lists/train_norm_agg.csv; --trainWithWindow=false; --transdiag=0; --unkscore=-inf; --use_memcache=false; --use_saug=false; --uselexicon=true; --usewordpiece=false; --valid=lists/dev_norm_agg.csv; --warmup=8000; --weightdecay=0; --wordscore=0; --wordseparator=|; --world_rank=0; --world_size=1; --outdir=/data/tmp/convert; --alsologtoemail=; --alsologtostderr=false; --colorlogtostderr=false; --drop_log_memory=true; --log_backtrace_at=; --log_dir=; --log_link=; --log_prefix=true; --logbuflevel=0; --logbufsecs=30; --logemaillevel=999; --logmailer=/bin/mail; --logtostderr=true; --max_log_size=1800; --minloglevel=0; --stderrthreshold=2; --stop_logging_if_full_disk=false; --symbolize_stacktrace=true; --v=0; --vmodule=; I0415 06:13:21.690137 384 StreamingTDSModelConverter.cpp:179] Number of classes (network): 40 Skipping View module: V -1 
NFEAT 1 0
Skipping SpecAugment module: SAUG 80 27 2 100 1.0 2
Skipping Dropout module: DO 0.1
Skipping Dropout module: DO 0.1
Skipping Dropout module: DO 0.1
Skipping Dropout module: DO 0.1
Skipping Reorder module: RO 2 1 0 3
Skipping View module: V 2160 -1 1 0
Skipping View module: V NLABEL 0 -1 1
Thread 1 "streaming_tds_m" received signal SIGSEGV, Segmentation fault.
0x0000000000419e6b in main ()
(gdb) bt

#0  0x0000000000419e6b in main ()
(gdb) thread apply all bt

Thread 4 (Thread 0x7fb8ded69700 (LWP 390)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
#1  0x00007fb8e7b00227 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#2  0x00007fb8e7a9eab7 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#3  0x00007fb8e7aff4a8 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#4  0x00007fb906b076ba in start_thread (arg=0x7fb8ded69700) at pthread_create.c:333
#5  0x00007fb8fef9241d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 3 (Thread 0x7fb8df5eb700 (LWP 389)):
#0  0x00007fb8fef8674d in poll () at ../sysdeps/unix/syscall-template.S:84
#1  0x00007fb8e7afce23 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#2  0x00007fb8e7ba7c3a in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#3  0x00007fb8e7aff4a8 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#4  0x00007fb906b076ba in start_thread (arg=0x7fb8df5eb700) at pthread_create.c:333
#5  0x00007fb8fef9241d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 2 (Thread 0x7fb8dfdec700 (LWP 388)):
#0  0x00007fb8fef938c8 in accept4 (fd=10, addr=..., addr_len=0x7fb8dfdddf58, flags=524288) at ../sysdeps/unix/sysv/linux/accept4.c:40
#1  0x00007fb8e7afddea in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#2  0x00007fb8e7aef68d in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#3  0x00007fb8e7aff4a8 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#4  0x00007fb906b076ba in start_thread (arg=0x7fb8dfdec700) at pthread_create.c:333
#5  0x00007fb8fef9241d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 1 (Thread 0x7fb948d33600 (LWP 384)):
#0  0x0000000000419e6b in main ()


As I'm not a C++ developer, please let me know which other gdb commands would provide useful information.

Thanks for your response.

lunixbochs commented 4 years ago

x/i $rip
i r

Also, if you build wav2letter in debug mode instead of release mode (via cmake), the binary should include symbols that will improve the backtrace.
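
For anyone unfamiliar with these tools, here is a rough sketch of the two suggestions above, not an official wav2letter procedure. `x/i $rip` prints the instruction at the crash address and `i r` is short for `info registers`. The function names `rebuild_debug` and `crash_report` are made up for illustration, and the sketch assumes the build directory is `./build` (as in the paths quoted earlier) and that a plain `make` rebuilds the tools.

```shell
#!/bin/sh
# Sketch only: nothing runs until one of the functions below is called.

# Reconfigure and rebuild with debug symbols.
# Assumption: the build directory is ./build unless another is given.
rebuild_debug() {
  build_dir="${1:-build}"
  mkdir -p "$build_dir" &&
    (cd "$build_dir" &&
      cmake .. -DCMAKE_BUILD_TYPE=Debug &&
      make)
}

# Re-run a crashing command under gdb non-interactively, then print the
# backtrace, the faulting instruction, and the registers.
# Usage: crash_report <program> [program args...]
crash_report() {
  gdb -batch \
    -ex run \
    -ex bt \
    -ex 'x/i $rip' \
    -ex 'info registers' \
    --args "$@"
}
```

With the functions loaded, something like `rebuild_debug build` followed by `crash_report ./build/tools/streaming_tds_model_converter -am <model> --outdir <dir>` should produce a report with file and line information, provided the debug build succeeds.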

mohamad-hasan-sohan-ajini commented 4 years ago

Building from the latest commit works fine: the converter now converts the model without errors.