flashlight / wav2letter

Facebook AI Research's Automatic Speech Recognition Toolkit
https://github.com/facebookresearch/wav2letter/wiki

W2lListFilesDataset.cpp:105] Could not read file '' #897

Closed ML6634 closed 3 years ago

ML6634 commented 3 years ago

I am running ResNet CTC training on user-recorded audio. For now I am testing with a very small number of audio files. I ran into this error:

Could not read file ''

Any ideas about that? Thank you!
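Since the error comes from W2lListFilesDataset while reading the train list (`--train=/root/w2l/lists/TmList_11-2020.lst`), one likely cause is a blank or malformed line in that list, which would make the parsed audio path come out empty. A quick sanity check might look like this (a sketch, assuming the standard wav2letter list layout of `<sample_id> <audio_path> <duration_ms> <transcript...>` per line):

```python
import sys

def find_bad_lines(lines):
    """Return (line_number, reason) pairs for list-file entries that would
    yield an empty or malformed audio path.
    Expected format per line: <sample_id> <audio_path> <duration_ms> <transcript...>
    """
    bad = []
    for i, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            # A blank line parses to zero columns -> empty path downstream.
            bad.append((i, "blank line"))
        elif len(fields) < 4:
            bad.append((i, "expected at least 4 columns, got %d" % len(fields)))
    return bad

if __name__ == "__main__":
    # Usage: python check_lst.py /root/w2l/lists/TmList_11-2020.lst
    with open(sys.argv[1]) as f:
        for lineno, reason in find_bad_lines(f.read().splitlines()):
            print("line %d: %s" % (lineno, reason))
```

If this reports a blank line (often a trailing newline at the end of the file) or a line with missing columns, removing or fixing that entry should get past the `Could not read file ''` failure.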

root@cc3bfdb33570:~# wav2letter/build/Train train --flagsfile wav2letter/recipes/models/sota/2019/librispeech/train_am_resnet_ctc.cfg --minloglevel=0 --logtostderr=1
I1123 01:48:15.528980    73 Train.cpp:59] Reading flags from file wav2letter/recipes/models/sota/2019/librispeech/train_am_resnet_ctc.cfg
Initialized NCCL 2.4.8 successfully!
I1123 01:48:15.820979    73 Train.cpp:151] Gflags after parsing 
--flagfile=; --fromenv=; --tryfromenv=; --undefok=; --tab_completion_columns=80; --tab_completion_word=; --help=false; --helpfull=false; --helpmatch=; --helpon=; --helppackage=false; --helpshort=false; --helpxml=false; --version=false; --adambeta1=0.90000000000000002; --adambeta2=0.999; --am=; --am_decoder_tr_dropout=0; --am_decoder_tr_layerdrop=0; --am_decoder_tr_layers=1; --arch=am_arch/am_resnet_ctc.arch; --archdir=/root/wav2letter/recipes/models/sota/2019; --attention=content; --attentionthreshold=2147483647; --attnWindow=no; --attnconvchannel=0; --attnconvkernel=0; --attndim=0; --batchsize=4; --beamsize=2500; --beamsizetoken=250000; --beamthreshold=25; --blobdata=false; --channels=1; --criterion=ctc; --critoptim=sgd; --datadir=; --dataorder=input; --decoderattnround=1; --decoderdropout=0; --decoderrnnlayer=1; --decodertype=wrd; --devwin=0; --emission_dir=; --emission_queue_size=3000; --enable_distributed=true; --encoderdim=0; --eosscore=0; --eostoken=false; --everstoredb=false; --fftcachesize=1; --filterbanks=80; --flagsfile=wav2letter/recipes/models/sota/2019/librispeech/train_am_resnet_ctc.cfg; --framesizems=25; --framestridems=10; --gamma=1; --gumbeltemperature=1; --input=wav; --inputbinsize=100; --inputfeeding=false; --isbeamdump=false; --iter=10000; --itersave=false; --labelsmooth=0.050000000000000003; --leftWindowSize=50; --lexicon=/root/w2l/am/librispeech-train+dev-unigram-10000-nbest10.lexicon; --linlr=-1; --linlrcrit=-1; --linseg=0; --lm=; --lm_memory=5000; --lm_vocab=; --lmtype=kenlm; --lmweight=0; --localnrmlleftctx=0; --localnrmlrightctx=0; --logadd=false; --lr=0.40000000000000002; --lr_decay=10000; --lr_decay_step=9223372036854775807; --lrcosine=true; --lrcrit=0; --max_devices_per_node=8; --maxdecoderoutputlen=200; --maxgradnorm=1; --maxisz=9223372036854775807; --maxload=-1; --maxrate=10; --maxsil=50; --maxtsz=9223372036854775807; --maxword=-1; --melfloor=1; --memstepsize=10485760; --mfcc=false; --mfcccoeffs=13; --mfsc=true; --minisz=200; 
--minrate=3; --minsil=0; --mintsz=2; --momentum=0.59999999999999998; --netoptim=sgd; --noresample=false; --nthread=4; --nthread_decoder=1; --nthread_decoder_am_forward=1; --numattnhead=8; --onorm=target; --optimepsilon=1e-08; --optimrho=0.90000000000000002; --outputbinsize=5; --pctteacherforcing=100; --pcttraineval=100; --pow=false; --pretrainWindow=0; --replabel=0; --reportiters=2000; --rightWindowSize=50; --rndv_filepath=; --rundir=/root/w2l/saved_models; --runname=am_resnet_ctc_librispeech; --samplerate=16000; --sampletarget=0.01; --samplingstrategy=rand; --saug_fmaskf=27; --saug_fmaskn=2; --saug_start_update=-1; --saug_tmaskn=2; --saug_tmaskp=1; --saug_tmaskt=100; --sclite=; --seed=0; --show=false; --showletters=false; --silscore=0; --smearing=none; --smoothingtemperature=1; --softwoffset=10; --softwrate=5; --softwstd=5; --sqnorm=true; --stepsize=9223372036854775807; --surround=; --tag=; --target=tkn; --test=; --tokens=librispeech-train-all-unigram-10000.tokens; --tokensdir=/root/w2l/am; --train=/root/w2l/lists/TmList_11-2020.lst; --trainWithWindow=false; --transdiag=0; --unkscore=-inf; --use_memcache=false; --uselexicon=true; --usewordpiece=true; --valid=; --validbatchsize=-1; --warmup=1; --weightdecay=0; --wordscore=0; --wordseparator=_; --world_rank=0; --world_size=1; --alsologtoemail=; --alsologtostderr=false; --colorlogtostderr=false; --drop_log_memory=true; --log_backtrace_at=; --log_dir=; --log_link=; --log_prefix=true; --logbuflevel=0; --logbufsecs=30; --logemaillevel=999; --logfile_mode=436; --logmailer=/bin/mail; --logtostderr=true; --max_log_size=1800; --minloglevel=0; --stderrthreshold=2; --stop_logging_if_full_disk=false; --symbolize_stacktrace=true; --v=0; --vmodule=; 
I1123 01:48:15.821269    73 Train.cpp:152] Experiment path: /root/w2l/saved_models/am_resnet_ctc_librispeech
I1123 01:48:15.821274    73 Train.cpp:153] Experiment runidx: 1
I1123 01:48:15.824019    73 Train.cpp:199] Number of classes (network): 9998
I1123 01:48:16.292203    73 Train.cpp:206] Number of words: 89612
I1123 01:48:16.315507    73 Train.cpp:220] Loading architecture file from /root/wav2letter/recipes/models/sota/2019/am_arch/am_resnet_ctc.arch
I1123 01:48:16.432730    73 Train.cpp:252] [Network] Sequential [input -> (0) -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> (9) -> (10) -> (11) -> (12) -> (13) -> (14) -> (15) -> (16) -> (17) -> (18) -> (19) -> (20) -> (21) -> (22) -> (23) -> (24) -> (25) -> (26) -> (27) -> (28) -> (29) -> (30) -> (31) -> (32) -> (33) -> (34) -> (35) -> (36) -> (37) -> (38) -> (39) -> (40) -> (41) -> (42) -> (43) -> (44) -> (45) -> (46) -> (47) -> (48) -> (49) -> (50) -> (51) -> (52) -> (53) -> (54) -> (55) -> (56) -> (57) -> (58) -> (59) -> (60) -> (61) -> (62) -> (63) -> (64) -> (65) -> (66) -> (67) -> (68) -> (69) -> (70) -> output]
    (0): SpecAugment ( W: 80, F: 27, mF: 2, T: 100, p: 1, mT: 2 )
    (1): View (-1 1 80 0)
    (2): Conv2D (80->1024, 3x1, 2,1, SAME,0, 1, 1) (with bias)
    (3): ReLU
    (4): Dropout (0.150000)
    (5): LayerNorm ( axis : { 0 1 2 } , size : -1)
    (6): 
    Res(0): Input; skip connection to output 
    Res(1): Conv2D (1024->1024, 3x1, 1,1, SAME,0, 1, 1) (with bias)
    Res(2): ReLU
    Res(3): Dropout (0.150000)
    Res(4): LayerNorm ( axis : { 0 1 2 } , size : -1)
    Res(5): Conv2D (1024->1024, 3x1, 1,1, SAME,0, 1, 1) (with bias)
    Res(6): ReLU
    Res(7): Dropout (0.150000)
    Res(8): LayerNorm ( axis : { 0 1 2 } , size : -1)
    Res(9): Conv2D (1024->1024, 3x1, 1,1, SAME,0, 1, 1) (with bias) with scale (before layer is applied) 0.70711;
    Res(10): Output;
    (7): ReLU
    (8): Dropout (0.150000)
    (9): LayerNorm ( axis : { 0 1 2 } , size : -1)
    (10): 
    Res(0): Input; skip connection to output 
    Res(1): Conv2D (1024->1024, 3x1, 1,1, SAME,0, 1, 1) (with bias)
    Res(2): ReLU
    Res(3): Dropout (0.150000)
    Res(4): LayerNorm ( axis : { 0 1 2 } , size : -1)
    Res(5): Conv2D (1024->1024, 3x1, 1,1, SAME,0, 1, 1) (with bias)
    Res(6): ReLU
    Res(7): Dropout (0.150000)
    Res(8): LayerNorm ( axis : { 0 1 2 } , size : -1)
    Res(9): Conv2D (1024->1024, 3x1, 1,1, SAME,0, 1, 1) (with bias) with scale (before layer is applied) 0.70711;
    Res(10): Output;
    (11): ReLU
    (12): Dropout (0.150000)
    (13): LayerNorm ( axis : { 0 1 2 } , size : -1)
    (14): 
    Res(0): Input; skip connection to output 
    Res(1): Conv2D (1024->1024, 3x1, 1,1, SAME,0, 1, 1) (with bias)
    Res(2): ReLU
    Res(3): Dropout (0.150000)
    Res(4): LayerNorm ( axis : { 0 1 2 } , size : -1)
    Res(5): Conv2D (1024->1024, 3x1, 1,1, SAME,0, 1, 1) (with bias)
    Res(6): ReLU
    Res(7): Dropout (0.150000)
    Res(8): LayerNorm ( axis : { 0 1 2 } , size : -1)
    Res(9): Conv2D (1024->1024, 3x1, 1,1, SAME,0, 1, 1) (with bias) with scale (before layer is applied) 0.70711;
    Res(10): Output;
    (15): ReLU
    (16): Dropout (0.150000)
    (17): LayerNorm ( axis : { 0 1 2 } , size : -1)
    (18): Pool2D-max (2x1, 2,1, 0,0)
    (19): 
    Res(0): Input; skip connection to output 
    Res(1): Conv2D (1024->1024, 3x1, 1,1, SAME,0, 1, 1) (with bias)
    Res(2): ReLU
    Res(3): Dropout (0.150000)
    Res(4): LayerNorm ( axis : { 0 1 2 } , size : -1)
    Res(5): Conv2D (1024->1024, 3x1, 1,1, SAME,0, 1, 1) (with bias)
    Res(6): ReLU
    Res(7): Dropout (0.150000)
    Res(8): LayerNorm ( axis : { 0 1 2 } , size : -1)
    Res(9): Conv2D (1024->1024, 3x1, 1,1, SAME,0, 1, 1) (with bias) with scale (before layer is applied) 0.70711;
    Res(10): Output;
    (20): ReLU
    (21): Dropout (0.150000)
    (22): LayerNorm ( axis : { 0 1 2 } , size : -1)
    (23): 
    Res(0): Input; skip connection to output 
    Res(1): Conv2D (1024->1024, 3x1, 1,1, SAME,0, 1, 1) (with bias)
    Res(2): ReLU
    Res(3): Dropout (0.150000)
    Res(4): LayerNorm ( axis : { 0 1 2 } , size : -1)
    Res(5): Conv2D (1024->1024, 3x1, 1,1, SAME,0, 1, 1) (with bias)
    Res(6): ReLU
    Res(7): Dropout (0.150000)
    Res(8): LayerNorm ( axis : { 0 1 2 } , size : -1)
    Res(9): Conv2D (1024->1024, 3x1, 1,1, SAME,0, 1, 1) (with bias) with scale (before layer is applied) 0.70711;
    Res(10): Output;
    (24): ReLU
    (25): Dropout (0.150000)
    (26): LayerNorm ( axis : { 0 1 2 } , size : -1)
    (27): 
    Res(0): Input; skip connection to output 
    Res(1): Conv2D (1024->1024, 3x1, 1,1, SAME,0, 1, 1) (with bias)
    Res(2): ReLU
    Res(3): Dropout (0.150000)
    Res(4): LayerNorm ( axis : { 0 1 2 } , size : -1)
    Res(5): Conv2D (1024->1024, 3x1, 1,1, SAME,0, 1, 1) (with bias)
    Res(6): ReLU
    Res(7): Dropout (0.150000)
    Res(8): LayerNorm ( axis : { 0 1 2 } , size : -1)
    Res(9): Conv2D (1024->1024, 3x1, 1,1, SAME,0, 1, 1) (with bias) with scale (before layer is applied) 0.70711;
    Res(10): Output;
    (28): ReLU
    (29): Dropout (0.150000)
    (30): LayerNorm ( axis : { 0 1 2 } , size : -1)
    (31): 
    Res(0): Input; skip connection to output 
    Res(1): Conv2D (1024->1024, 3x1, 1,1, SAME,0, 1, 1) (with bias)
    Res(2): ReLU
    Res(3): Dropout (0.150000)
    Res(4): LayerNorm ( axis : { 0 1 2 } , size : -1)
    Res(5): Conv2D (1024->1024, 3x1, 1,1, SAME,0, 1, 1) (with bias)
    Res(6): ReLU
    Res(7): Dropout (0.150000)
    Res(8): LayerNorm ( axis : { 0 1 2 } , size : -1)
    Res(9): Conv2D (1024->1024, 3x1, 1,1, SAME,0, 1, 1) (with bias) with scale (before layer is applied) 0.70711;
    Res(10): Output;
    (32): ReLU
    (33): Dropout (0.150000)
    (34): LayerNorm ( axis : { 0 1 2 } , size : -1)
    (35): Pool2D-max (2x1, 2,1, 0,0)
    (36): 
    Res(0): Input; skip connection to output 
    Res(1): Conv2D (1024->1024, 3x1, 1,1, SAME,0, 1, 1) (with bias)
    Res(2): ReLU
    Res(3): Dropout (0.150000)
    Res(4): LayerNorm ( axis : { 0 1 2 } , size : -1)
    Res(5): Conv2D (1024->1024, 3x1, 1,1, SAME,0, 1, 1) (with bias)
    Res(6): ReLU
    Res(7): Dropout (0.150000)
    Res(8): LayerNorm ( axis : { 0 1 2 } , size : -1)
    Res(9): Conv2D (1024->1024, 3x1, 1,1, SAME,0, 1, 1) (with bias) with scale (before layer is applied) 0.70711;
    Res(10): Output;
    (37): ReLU
    (38): Dropout (0.150000)
    (39): LayerNorm ( axis : { 0 1 2 } , size : -1)
    (40): Conv2D (1024->2048, 3x1, 1,1, SAME,0, 1, 1) (with bias)
    (41): ReLU
    (42): Dropout (0.150000)
    (43): LayerNorm ( axis : { 0 1 2 } , size : -1)
    (44): 
    Res(0): Input; skip connection to output 
    Res(1): Conv2D (2048->2048, 3x1, 1,1, SAME,0, 1, 1) (with bias)
    Res(2): ReLU
    Res(3): Dropout (0.200000)
    Res(4): LayerNorm ( axis : { 0 1 2 } , size : -1)
    Res(5): Conv2D (2048->2048, 3x1, 1,1, SAME,0, 1, 1) (with bias)
    Res(6): ReLU
    Res(7): Dropout (0.200000)
    Res(8): LayerNorm ( axis : { 0 1 2 } , size : -1)
    Res(9): Conv2D (2048->2048, 3x1, 1,1, SAME,0, 1, 1) (with bias) with scale (before layer is applied) 0.70711;
    Res(10): Output;
    (45): ReLU
    (46): Dropout (0.200000)
    (47): LayerNorm ( axis : { 0 1 2 } , size : -1)
    (48): 
    Res(0): Input; skip connection to output 
    Res(1): Conv2D (2048->2048, 3x1, 1,1, SAME,0, 1, 1) (with bias)
    Res(2): ReLU
    Res(3): Dropout (0.200000)
    Res(4): LayerNorm ( axis : { 0 1 2 } , size : -1)
    Res(5): Conv2D (2048->2048, 3x1, 1,1, SAME,0, 1, 1) (with bias)
    Res(6): ReLU
    Res(7): Dropout (0.200000)
    Res(8): LayerNorm ( axis : { 0 1 2 } , size : -1)
    Res(9): Conv2D (2048->2048, 3x1, 1,1, SAME,0, 1, 1) (with bias) with scale (before layer is applied) 0.70711;
    Res(10): Output;
    (49): ReLU
    (50): Dropout (0.200000)
    (51): LayerNorm ( axis : { 0 1 2 } , size : -1)
    (52): Conv2D (2048->2304, 3x1, 1,1, SAME,0, 1, 1) (with bias)
    (53): ReLU
    (54): Dropout (0.200000)
    (55): LayerNorm ( axis : { 0 1 2 } , size : -1)
    (56): Pool2D-max (2x1, 2,1, 0,0)
    (57): 
    Res(0): Input; skip connection to output 
    Res(1): Conv2D (2304->2304, 3x1, 1,1, SAME,0, 1, 1) (with bias)
    Res(2): ReLU
    Res(3): Dropout (0.250000)
    Res(4): LayerNorm ( axis : { 0 1 2 } , size : -1)
    Res(5): Conv2D (2304->2304, 3x1, 1,1, SAME,0, 1, 1) (with bias)
    Res(6): ReLU
    Res(7): Dropout (0.250000)
    Res(8): LayerNorm ( axis : { 0 1 2 } , size : -1)
    Res(9): Conv2D (2304->2304, 3x1, 1,1, SAME,0, 1, 1) (with bias) with scale (before layer is applied) 0.70711;
    Res(10): Output;
    (58): ReLU
    (59): Dropout (0.250000)
    (60): LayerNorm ( axis : { 0 1 2 } , size : -1)
    (61): 
    Res(0): Input; skip connection to output 
    Res(1): Conv2D (2304->2304, 3x1, 1,1, SAME,0, 1, 1) (with bias)
    Res(2): ReLU
    Res(3): Dropout (0.250000)
    Res(4): LayerNorm ( axis : { 0 1 2 } , size : -1)
    Res(5): Conv2D (2304->2304, 3x1, 1,1, SAME,0, 1, 1) (with bias)
    Res(6): ReLU
    Res(7): Dropout (0.250000)
    Res(8): LayerNorm ( axis : { 0 1 2 } , size : -1)
    Res(9): Conv2D (2304->2304, 3x1, 1,1, SAME,0, 1, 1) (with bias) with scale (before layer is applied) 0.70711;
    Res(10): Output;
    (62): ReLU
    (63): Dropout (0.250000)
    (64): LayerNorm ( axis : { 0 1 2 } , size : -1)
    (65): Conv2D (2304->2304, 3x1, 1,1, SAME,0, 1, 1) (with bias)
    (66): ReLU
    (67): LayerNorm ( axis : { 0 1 2 } , size : -1)
    (68): Dropout (0.250000)
    (69): Conv2D (2304->9998, 1x1, 1,1, SAME,0, 1, 1) (with bias)
    (70): Reorder (2,0,3,1)
I1123 01:48:16.433024    73 Train.cpp:253] [Network Params: 306268510]
I1123 01:48:16.433035    73 Train.cpp:254] [Criterion] ConnectionistTemporalClassificationCriterion
I1123 01:48:16.531312    73 Train.cpp:262] [Network Optimizer] SGD (momentum=0.6)
I1123 01:48:16.531330    73 Train.cpp:263] [Criterion Optimizer] SGD
Falling back to using letters as targets for the unknown word: babysitter
Falling back to using letters as targets for the unknown word: thi
Falling back to using letters as targets for the unknown word: shit
Falling back to using letters as targets for the unknown word: menash
Falling back to using letters as targets for the unknown word: wiz
Falling back to using letters as targets for the unknown word: menash
Falling back to using letters as targets for the unknown word: video
Falling back to using letters as targets for the unknown word: michelle
Falling back to using letters as targets for the unknown word: michelle's
Falling back to using letters as targets for the unknown word: caldor's
Falling back to using letters as targets for the unknown word: woolworth's
Falling back to using letters as targets for the unknown word: woolworth's
Falling back to using letters as targets for the unknown word: woolworth's
Falling back to using letters as targets for the unknown word: cheapie
Falling back to using letters as targets for the unknown word: markdown
Falling back to using letters as targets for the unknown word: mommy
Falling back to using letters as targets for the unknown word: tel
Falling back to using letters as targets for the unknown word: aviv
Falling back to using letters as targets for the unknown word: menash
Falling back to using letters as targets for the unknown word: tel
Falling back to using letters as targets for the unknown word: aviv
Falling back to using letters as targets for the unknown word: rishon
Falling back to using letters as targets for the unknown word: superland
Falling back to using letters as targets for the unknown word: talya
Falling back to using letters as targets for the unknown word: wucha
Falling back to using letters as targets for the unknown word: shera
Falling back to using letters as targets for the unknown word: tushie
Falling back to using letters as targets for the unknown word: tushie
Falling back to using letters as targets for the unknown word: ultrasound
Falling back to using letters as targets for the unknown word: michelle
Falling back to using letters as targets for the unknown word: shit
Falling back to using letters as targets for the unknown word: holnick
Falling back to using letters as targets for the unknown word: mommy
Falling back to using letters as targets for the unknown word: lisa's
Falling back to using letters as targets for the unknown word: kno
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: setups
Falling back to using letters as targets for the unknown word: mommy
Falling back to using letters as targets for the unknown word: mommy
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: enou
Falling back to using letters as targets for the unknown word: drexel
Falling back to using letters as targets for the unknown word: bic
Falling back to using letters as targets for the unknown word: perf
Falling back to using letters as targets for the unknown word: expens
Falling back to using letters as targets for the unknown word: inexpensively
Falling back to using letters as targets for the unknown word: inexpensively
Falling back to using letters as targets for the unknown word: kiara
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: waterworld
Falling back to using letters as targets for the unknown word: blockbuster
Falling back to using letters as targets for the unknown word: thi
Falling back to using letters as targets for the unknown word: costner's
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: gritti
Falling back to using letters as targets for the unknown word: gritti
Falling back to using letters as targets for the unknown word: unders
Falling back to using letters as targets for the unknown word: alri
Falling back to using letters as targets for the unknown word: gritti's
Falling back to using letters as targets for the unknown word: longstemmed
Falling back to using letters as targets for the unknown word: longstemmed
Falling back to using letters as targets for the unknown word: gym's
Falling back to using letters as targets for the unknown word: cardio
Falling back to using letters as targets for the unknown word: cardio
Falling back to using letters as targets for the unknown word: bally's
Falling back to using letters as targets for the unknown word: weeknights
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: manachi's
Falling back to using letters as targets for the unknown word: lorraine's
Falling back to using letters as targets for the unknown word: pissy
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: goofy
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: atrium
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: laser
Falling back to using letters as targets for the unknown word: laser
Falling back to using letters as targets for the unknown word: wii
Falling back to using letters as targets for the unknown word: cv
Falling back to using letters as targets for the unknown word: offboard
Falling back to using letters as targets for the unknown word: amplifier
Falling back to using letters as targets for the unknown word: nah
Falling back to using letters as targets for the unknown word: redistricting
Falling back to using letters as targets for the unknown word: buzzwords
Falling back to using letters as targets for the unknown word: restructuring
Falling back to using letters as targets for the unknown word: expertise
Falling back to using letters as targets for the unknown word: nasa
Falling back to using letters as targets for the unknown word: email
Falling back to using letters as targets for the unknown word: siberians
Falling back to using letters as targets for the unknown word: carribean
Falling back to using letters as targets for the unknown word: nasa
Falling back to using letters as targets for the unknown word: branow
Falling back to using letters as targets for the unknown word: freeway
Falling back to using letters as targets for the unknown word: gainesville
Falling back to using letters as targets for the unknown word: universi
Falling back to using letters as targets for the unknown word: branow
Falling back to using letters as targets for the unknown word: commuting
Falling back to using letters as targets for the unknown word: commuting
Falling back to using letters as targets for the unknown word: I
Skipping unknown token 'I' when falling back to letter target for the unknown word: I
Falling back to using letters as targets for the unknown word: you,
Skipping unknown token ',' when falling back to letter target for the unknown word: you,
Falling back to using letters as targets for the unknown word: mentor,
Skipping unknown token ',' when falling back to letter target for the unknown word: mentor,
Falling back to using letters as targets for the unknown word: sayonara
Falling back to using letters as targets for the unknown word: kari
Falling back to using letters as targets for the unknown word: kari
Falling back to using letters as targets for the unknown word: buddies
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: lipstien
Falling back to using letters as targets for the unknown word: krulac
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: krulac
Falling back to using letters as targets for the unknown word: taping
Falling back to using letters as targets for the unknown word: berens
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: jennifer
Falling back to using letters as targets for the unknown word: redeye
Falling back to using letters as targets for the unknown word: syra
Falling back to using letters as targets for the unknown word: ouchies
Falling back to using letters as targets for the unknown word: ouchies
Falling back to using letters as targets for the unknown word: ouchies
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: doghouse
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: revisionists
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: arri
Falling back to using letters as targets for the unknown word: kingpin
Falling back to using letters as targets for the unknown word: furobiashi
Falling back to using letters as targets for the unknown word: motivated
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: unmotivated
Falling back to using letters as targets for the unknown word: hiroki
Falling back to using letters as targets for the unknown word: minimal
Falling back to using letters as targets for the unknown word: minimal
Falling back to using letters as targets for the unknown word: unmotivated
Falling back to using letters as targets for the unknown word: hiroki
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: apgar
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: escapees
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: psych
Falling back to using letters as targets for the unknown word: soc
Falling back to using letters as targets for the unknown word: socio
Falling back to using letters as targets for the unknown word: psych
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: shanley
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mandatory
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: bummed
Falling back to using letters as targets for the unknown word: shou
Falling back to using letters as targets for the unknown word: comp
Falling back to using letters as targets for the unknown word: busting
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: flanner
Falling back to using letters as targets for the unknown word: o'connery
Falling back to using letters as targets for the unknown word: psych
Falling back to using letters as targets for the unknown word: soc
Falling back to using letters as targets for the unknown word: midterm
Falling back to using letters as targets for the unknown word: soc
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: comp
Falling back to using letters as targets for the unknown word: sch
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: thumper
Falling back to using letters as targets for the unknown word: ultrasound
Falling back to using letters as targets for the unknown word: email
Falling back to using letters as targets for the unknown word: amnio
Falling back to using letters as targets for the unknown word: ultrasound
Falling back to using letters as targets for the unknown word: ultrasound
Falling back to using letters as targets for the unknown word: ultrasound
Falling back to using letters as targets for the unknown word: ultrasound
Falling back to using letters as targets for the unknown word: positives
Falling back to using letters as targets for the unknown word: syndrome
Falling back to using letters as targets for the unknown word: amnio
Falling back to using letters as targets for the unknown word: aborted
Falling back to using letters as targets for the unknown word: termina
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: ok
Falling back to using letters as targets for the unknown word: humongous
Falling back to using letters as targets for the unknown word: free!
Skipping unknown token '!' when falling back to letter target for the unknown word: free!
Falling back to using letters as targets for the unknown word: crosswords
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: etta
Falling back to using letters as targets for the unknown word: duh
Falling back to using letters as targets for the unknown word: duh
Falling back to using letters as targets for the unknown word: buh
Falling back to using letters as targets for the unknown word: buh
Falling back to using letters as targets for the unknown word: duh
Falling back to using letters as targets for the unknown word: duh
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: groovy
Falling back to using letters as targets for the unknown word: anya
Falling back to using letters as targets for the unknown word: echte
Falling back to using letters as targets for the unknown word: mil
Falling back to using letters as targets for the unknown word: expatriot
Falling back to using letters as targets for the unknown word: expatriot
Falling back to using letters as targets for the unknown word: deductible
Falling back to using letters as targets for the unknown word: problem's
Falling back to using letters as targets for the unknown word: problem's
Falling back to using letters as targets for the unknown word: anya's
Falling back to using letters as targets for the unknown word: deductible
Falling back to using letters as targets for the unknown word: anya
Falling back to using letters as targets for the unknown word: anya
Falling back to using letters as targets for the unknown word: kans
Falling back to using letters as targets for the unknown word: anya's
Falling back to using letters as targets for the unknown word: anya's
Falling back to using letters as targets for the unknown word: email
Falling back to using letters as targets for the unknown word: email
Falling back to using letters as targets for the unknown word: email
Falling back to using letters as targets for the unknown word: anya's
Falling back to using letters as targets for the unknown word: mifrau
Falling back to using letters as targets for the unknown word: mifrau
Falling back to using letters as targets for the unknown word: mifrau
Falling back to using letters as targets for the unknown word: nah
Falling back to using letters as targets for the unknown word: amer
Falling back to using letters as targets for the unknown word: iffy
Falling back to using letters as targets for the unknown word: kevin
Falling back to using letters as targets for the unknown word: stu
Falling back to using letters as targets for the unknown word: tetons
Falling back to using letters as targets for the unknown word: benneton
Falling back to using letters as targets for the unknown word: yuppie
Falling back to using letters as targets for the unknown word: bebe
Falling back to using letters as targets for the unknown word: skydiving
Falling back to using letters as targets for the unknown word: skydiving
Falling back to using letters as targets for the unknown word: mini
Falling back to using letters as targets for the unknown word: diaper
Falling back to using letters as targets for the unknown word: disposables
Falling back to using letters as targets for the unknown word: doo's
Falling back to using letters as targets for the unknown word: shit
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: ooh
Falling back to using letters as targets for the unknown word: hysterectomies
Falling back to using letters as targets for the unknown word: gyn
Falling back to using letters as targets for the unknown word: gyn
Falling back to using letters as targets for the unknown word: staticky
Falling back to using letters as targets for the unknown word: staticky
Falling back to using letters as targets for the unknown word: marchish
Falling back to using letters as targets for the unknown word: mayb
Falling back to using letters as targets for the unknown word: middlebury
Falling back to using letters as targets for the unknown word: god!
Skipping unknown token '!' when falling back to letter target for the unknown word: god!
Falling back to using letters as targets for the unknown word: sunbathe
Falling back to using letters as targets for the unknown word: sightsee
Falling back to using letters as targets for the unknown word: brits
Falling back to using letters as targets for the unknown word: ascher
Falling back to using letters as targets for the unknown word: ascher
Falling back to using letters as targets for the unknown word: god!
Skipping unknown token '!' when falling back to letter target for the unknown word: god!
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: happ
Falling back to using letters as targets for the unknown word: hmm
Falling back to using letters as targets for the unknown word: hmm
Falling back to using letters as targets for the unknown word: mhmm
Falling back to using letters as targets for the unknown word: mhmm
Falling back to using letters as targets for the unknown word: tuftees
Falling back to using letters as targets for the unknown word: ron
Falling back to using letters as targets for the unknown word: gleason
Falling back to using letters as targets for the unknown word: ron
Falling back to using letters as targets for the unknown word: ucsd
Falling back to using letters as targets for the unknown word: letdown
Falling back to using letters as targets for the unknown word: umm
Falling back to using letters as targets for the unknown word: ron
Falling back to using letters as targets for the unknown word: meryl
Falling back to using letters as targets for the unknown word: diagnosing
Falling back to using letters as targets for the unknown word: ron
Falling back to using letters as targets for the unknown word: ron's
Falling back to using letters as targets for the unknown word: ron
Falling back to using letters as targets for the unknown word: academics
Falling back to using letters as targets for the unknown word: gana
Falling back to using letters as targets for the unknown word: ginny
Falling back to using letters as targets for the unknown word: totta
Falling back to using letters as targets for the unknown word: dougo
Falling back to using letters as targets for the unknown word: dougo's
Falling back to using letters as targets for the unknown word: motel
Falling back to using letters as targets for the unknown word: mov
Falling back to using letters as targets for the unknown word: bullshit
Falling back to using letters as targets for the unknown word: no!
Skipping unknown token '!' when falling back to letter target for the unknown word: no!
Falling back to using letters as targets for the unknown word: crap
Falling back to using letters as targets for the unknown word: dougo
Falling back to using letters as targets for the unknown word: blah
Falling back to using letters as targets for the unknown word: switcheroony
Falling back to using letters as targets for the unknown word: motel
Falling back to using letters as targets for the unknown word: repaint
Falling back to using letters as targets for the unknown word: erzhebat's
Falling back to using letters as targets for the unknown word: reme
Falling back to using letters as targets for the unknown word: yerushalaim
Falling back to using letters as targets for the unknown word: kibbutz
Falling back to using letters as targets for the unknown word: tzavah
Falling back to using letters as targets for the unknown word: oy
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: liat
Falling back to using letters as targets for the unknown word: jill's
Falling back to using letters as targets for the unknown word: farting
Falling back to using letters as targets for the unknown word: bo's
Falling back to using letters as targets for the unknown word: jill's
Falling back to using letters as targets for the unknown word: shit
Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: shit
Falling back to using letters as targets for the unknown word: would've
Falling back to using letters as targets for the unknown word: fucking
Falling back to using letters as targets for the unknown word: shit
Falling back to using letters as targets for the unknown word: pissed
Falling back to using letters as targets for the unknown word: shit
Falling back to using letters as targets for the unknown word: shit
Falling back to using letters as targets for the unknown word: mhm
I1123 01:48:16.676594    73 W2lListFilesDataset.cpp:141] 20 files found. 
I1123 01:48:16.676610    73 Utils.cpp:102] Filtered 0/20 samples
I1123 01:48:16.676621    73 W2lListFilesDataset.cpp:62] Total batches (i.e. iters): 5
F1123 01:48:16.817703    73 W2lListFilesDataset.cpp:105] Could not read file ''
*** Check failure stack trace: ***
    @     0x7f4e444b10cd  google::LogMessage::Fail()
    @     0x7f4e444b2f33  google::LogMessage::SendToLog()
    @     0x7f4e444b0c28  google::LogMessage::Flush()
    @     0x7f4e444b3999  google::LogMessageFatal::~LogMessageFatal()
    @     0x55d69526e71b  w2l::W2lListFilesDataset::loadListFile()
    @     0x55d69526f27d  w2l::W2lListFilesDataset::W2lListFilesDataset()
    @     0x55d69528fa1e  w2l::createDataset()
    @     0x55d695018e67  main
    @     0x7f4e43796b97  __libc_start_main
    @     0x55d69507fe4a  _start
Aborted (core dumped)
abhinavkulkarni commented 3 years ago

@ML6634: Your --valid flag is empty. Should be something like:

--valid=dev-clean:/home/w2luser/w2l/lists/dev-clean.lst,dev-other:/home/w2luser/w2l/lists/dev-other.lst
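For reference, a minimal sketch of generating such a `.lst` file programmatically. This assumes the list format these recipes expect — one sample per line, `<sample_id> <audio_path> <duration_ms> <transcription>` — and the IDs and paths below are made up for illustration:

```python
# Hypothetical helper: write a wav2letter-style .lst file for --train/--valid.
# Assumed line format: "<sample_id> <audio_path> <duration_ms> <transcription>".
from pathlib import Path

def write_list_file(samples, out_path):
    """samples: iterable of (sample_id, audio_path, duration_ms, transcript)."""
    lines = [
        f"{sid} {path} {duration_ms} {transcript}"
        for sid, path, duration_ms, transcript in samples
    ]
    Path(out_path).write_text("\n".join(lines) + "\n")
    return lines

# Example with made-up paths:
lines = write_list_file(
    [("dev-0001", "/home/w2luser/audio/dev-0001.wav", 4200.0, "hello world")],
    "/tmp/dev-clean.lst",
)
```

An empty or missing entry in `--valid` produces exactly the `Could not read file ''` failure above, since the dataset loader tries to open an empty path.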
ML6634 commented 3 years ago

Thanks @abhinavkulkarni for the helpful comment, which has taken care of the issue!

(1) After that, I got:

ArrayFire Exception (Device out of memory:101)

On my computer, training with the default

--train=/root/w2l/lists/train-clean-100.lst,/root/w2l/lists/train-clean-360.lst,/root/w2l/lists/train-other-500.lst

went through fine. Now I have replaced it with 20 telephony audios, each around 10 minutes long. The training data is smaller overall, so why does it cause

ArrayFire Exception (Device out of memory:101)

?

(2) If I reduce batchsize to 1, the "out of memory" issue goes away. However, I then get:

Falling back to using letters as targets for the unknown word: mhm
Falling back to using letters as targets for the unknown word: mhm
terminate called after throwing an instance of 'std::runtime_error'
  what():  Error: compute_ctc_loss, stat = label length >639 is not supported
*** Aborted at 1606106683 (unix time) try "date -d @1606106683" if you are using GNU date ***
PC: @     0x7f346b24ce97 gsignal
*** SIGABRT (@0xd6) received by PID 214 (TID 0x7f34b0b1e380) from PID 214; stack trace: ***
    @     0x7f34a8e32890 (unknown)
    @     0x7f346b24ce97 gsignal
    @     0x7f346b24e801 abort
    @     0x7f346bc41957 (unknown)
    @     0x7f346bc47ab6 (unknown)
    @     0x7f346bc47af1 std::terminate()
    @     0x7f346bc47d24 __cxa_throw
    @     0x563f69b3681f w2l::(anonymous namespace)::throw_on_error()
    @     0x563f69b37a16 w2l::ConnectionistTemporalClassificationCriterion::forward()
    @     0x563f699d3d30 _ZZ4mainENKUlSt10shared_ptrIN2fl6ModuleEES_IN3w2l17SequenceCriterionEES_INS3_10W2lDatasetEES_INS0_19FirstOrderOptimizerEES9_ddblE3_clES2_S5_S7_S9_S9_ddbl
    @     0x563f699674d8 main
    @     0x7f346b22fb97 __libc_start_main
    @     0x563f699cde4a _start
Aborted (core dumped)

Any comments about this error? Is it because my audios, at around 10 minutes each, are too long? Thank you!
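The error message points at exactly that: the CTC build here rejects label sequences longer than 639 tokens, and with letter-fallback targets a 10-minute transcript blows well past that. A rough back-of-the-envelope sketch (not wav2letter's actual target pipeline; the words-per-minute figure is an assumption):

```python
# Rough sketch: estimate CTC label length when words fall back to letter
# targets (roughly one token per character). The limit below comes from
# the error message "label length >639 is not supported".
MAX_LABEL_LEN = 639

def letter_target_length(transcript):
    # One token per character; spaces become word-separator tokens.
    return len(transcript.replace(" ", "|"))

short = letter_target_length("hello world")  # 11 tokens

# A 10-minute call at an assumed ~100 words/min, ~6 tokens/word
# (5 letters + separator) gives on the order of:
long_estimate = 10 * 100 * 6  # 6000 tokens, ~10x over the limit
```

So the failure is expected for unsegmented 10-minute utterances; splitting the audio (and transcripts) into short chunks resolves it.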

abhinavkulkarni commented 3 years ago

@ML6634: This means you are running low on GPU memory. You can monitor GPU memory usage with the watch nvidia-smi command. This is not a bug, just a limitation of the resources on your system.

ML6634 commented 3 years ago

I have:

ml@ml-Alienware-Aurora-Ryzen-Edition:~$ nvidia-smi
Mon Nov 23 01:54:01 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 455.23.05    Driver Version: 455.23.05    CUDA Version: 11.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 208...  On   | 00000000:0B:00.0  On |                  N/A |
| 18%   31C    P8     1W / 250W |    381MiB / 11011MiB |      2%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      3806      G   /usr/lib/xorg/Xorg                 35MiB |
|    0   N/A  N/A      6870      G   /usr/lib/xorg/Xorg                260MiB |
|    0   N/A  N/A      7098      G   /usr/bin/gnome-shell               43MiB |
|    0   N/A  N/A      8585      G   /usr/lib/firefox/firefox            3MiB |
|    0   N/A  N/A      8885      G   /usr/lib/firefox/firefox            3MiB |
|    0   N/A  N/A      9930      G   /usr/lib/firefox/firefox            3MiB |
|    0   N/A  N/A     12009      G   /usr/lib/firefox/firefox            3MiB |
|    0   N/A  N/A     22122      G   /usr/lib/firefox/firefox            3MiB |
|    0   N/A  N/A     28791      G   /usr/lib/firefox/firefox            3MiB |
|    0   N/A  N/A    161973      G   /usr/lib/firefox/firefox            3MiB |
|    0   N/A  N/A    171677      G   /usr/lib/firefox/firefox            3MiB |
+-----------------------------------------------------------------------------+

Is there any way for me to still run the training? Thank you!

abhinavkulkarni commented 3 years ago

@ML6634: Run watch nvidia-smi while the training/decoding loop is running. The watch prefix re-runs the command every couple of seconds, so as your data is processed on the GPU you can watch the memory usage grow and verify that at some point your batch does exceed the GPU memory.

ML6634 commented 3 years ago

Thanks @abhinavkulkarni for the help! I ran the training quite a few times. For most runs the peak GPU memory usage I saw was 9027 MiB; for some runs it was as high as 9357 MiB.

On my computer, for the default

--train=/root/w2l/lists/train-clean-100.lst,/root/w2l/lists/train-clean-360.lst,/root/w2l/lists/train-other-500.lst

the training went through even with

--batchsize=4

Now I am training on 20 telephony audios instead, each around 10 minutes long. The training data is smaller overall, so why does it cause a GPU memory issue? Is there any way to take care of it? Thank you!

abhinavkulkarni commented 3 years ago

Now I am training on 20 telephony audios instead

Did you mean a mini-batch size of 20? If so, you may want to round it to a power of 2, such as 2, 4, 8, 16, 32, etc.

Is there any way to take care of it?

You can try splitting your audio into chunks of 15-45s as described here: https://github.com/facebookresearch/wav2letter/issues/797#issuecomment-686875994
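For a quick illustration of the mechanics, here is a naive fixed-length splitter using only Python's standard-library wave module. This is not the thread's recommended tool (the linked approach does smarter VAD-based segmentation that keeps transcripts aligned); it just shows what chunking a long mono WAV looks like:

```python
# Illustrative sketch only: split a mono WAV into fixed-length chunks.
import wave

def split_wav(in_path, out_prefix, chunk_seconds=30):
    """Split in_path into chunk_seconds pieces; returns the output paths."""
    out_paths = []
    with wave.open(in_path, "rb") as src:
        params = src.getparams()
        frames_per_chunk = int(params.framerate * chunk_seconds)
        idx = 0
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break
            out_path = f"{out_prefix}-{idx:04d}.wav"
            with wave.open(out_path, "wb") as dst:
                dst.setparams(params)  # header frame count fixed up on close
                dst.writeframes(frames)
            out_paths.append(out_path)
            idx += 1
    return out_paths

# Demo on a synthetic 1-second silent WAV (8 kHz, mono, 16-bit):
with wave.open("/tmp/demo.wav", "wb") as w:
    w.setnchannels(1); w.setsampwidth(2); w.setframerate(8000)
    w.writeframes(b"\x00\x00" * 8000)

chunks = split_wav("/tmp/demo.wav", "/tmp/demo-chunk", chunk_seconds=0.4)
# chunks holds 3 files: 0.4 s + 0.4 s + 0.2 s
```

Note that blind fixed-length cuts can split words in half and leave transcripts misaligned, which is why the VAD-based tooling linked above is the better option in practice.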

ML6634 commented 3 years ago

Thank @abhinavkulkarni for the help!

Did you mean a mini-batch size of 20? If so, you may want to round it to a power of 2, such as 2, 4, 8, 16, 32, etc.

I just meant that my

--train

is a list of 20 audios, each around 10 minutes long. Which value do you suggest I round to a power of 2?

I plan to split the audios and transcripts into chunks of 15-45 seconds. What software or method would you recommend for splitting them? Thank you!

tlikhomanenko commented 3 years ago

The problem is not the size of your training set; the problem is the batch size. In Librispeech almost every audio is shorter than 36 seconds, so with batchsize=6 on one GPU the total audio duration per batch was under 3.6 minutes. Now even a single audio sample (batchsize=1) will probably cause OOM, because it alone is 10 minutes. Either use batchsize=1 (if a 10-minute audio fits into memory) or segment the original audio into chunks. You can use our tools for this here: https://github.com/facebookresearch/wav2letter/tree/v0.2/tools#voice-activity-detection-with-ctc--an-n-gram-language-model
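The duration arithmetic above can be sketched in a few lines (the batchsize=6 and 36 s figures are taken from the comment, not measured; activation memory scales roughly with total audio seconds per batch):

```python
# Back-of-the-envelope: audio seconds per batch, a rough proxy for the
# activation memory a batch needs.
def batch_audio_seconds(batch_size, clip_seconds):
    return batch_size * clip_seconds

librispeech = batch_audio_seconds(6, 36)   # <= 216 s (~3.6 min) per batch
telephony = batch_audio_seconds(1, 600)    # 600 s (10 min) per batch

# Even batchsize=1 on 10-minute clips packs ~3x the audio per batch that
# worked on Librispeech, which is why 15-45 s segmentation helps.
```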

ML6634 commented 3 years ago

Thanks @tlikhomanenko for the helpful comments!

tlikhomanenko commented 3 years ago

Closing, feel free to create another issue if needed.