lalimili6 opened this issue 2 years ago
This likely has something to do with the acoustic or LM scale options. Show the full command line that you used.
Here are the log files.
steps/nnet3/decode_semisup.sh
# nnet3-latgen-faster --online-ivectors=scp:exp/nnet3/ivectors_test_hires//ivector_online.scp --online-ivector-period=10 --frame-subsampling-factor=3 --frames-per-chunk=50 --extra-left-context=0 --extra-right-context=0 --extra-left-context-initial=-1 --extra-right-context-final=-1 --minimize=false --word-determinize=false --max-active=7000 --min-active=200 --beam=15.0 --lattice-beam=8.0 --acoustic-scale=1.0 --allow-partial=true --word-symbol-table=model/graph_test//words.txt --determinize-lattice=false model_online/final.mdl model/graph_test//HCLG.fst "ark,s,cs:apply-cmvn --utt2spk=ark:data/test_hires/split4/1/utt2spk scp:data/test_hires/split4/1/cmvn.scp scp:data/test_hires/split4/1/feats.scp ark:- |" "ark:| lattice-determinize-phone-pruned --beam=8.0 --acoustic-scale=1.0 --minimize=false --word-determinize=false --write-compact=false model_online/final.mdl ark:- ark:- | lattice-scale --acoustic-scale=10.0 --write-compact=false ark:- ark:- | gzip -c >model_online/decode__test_semiup/lat.1.gz"
# Started at Sat Mar 12 08:58:58 UTC 2022
#
nnet3-latgen-faster --online-ivectors=scp:exp/nnet3/ivectors_test_hires//ivector_online.scp --online-ivector-period=10 --frame-subsampling-factor=3 --frames-per-chunk=50 --extra-left-context=0 --extra-right-context=0 --extra-left-context-initial=-1 --extra-right-context-final=-1 --minimize=false --word-determinize=false --max-active=7000 --min-active=200 --beam=15.0 --lattice-beam=8.0 --acoustic-scale=1.0 --allow-partial=true --word-symbol-table=model/graph_test//words.txt --determinize-lattice=false model_online/final.mdl model/graph_test//HCLG.fst 'ark,s,cs:apply-cmvn --utt2spk=ark:data/test_hires/split4/1/utt2spk scp:data/test_hires/split4/1/cmvn.scp scp:data/test_hires/split4/1/feats.scp ark:- |' 'ark:| lattice-determinize-phone-pruned --beam=8.0 --acoustic-scale=1.0 --minimize=false --word-determinize=false --write-compact=false model_online/final.mdl ark:- ark:- | lattice-scale --acoustic-scale=10.0 --write-compact=false ark:- ark:- | gzip -c >model_online/decode__test_semiup/lat.1.gz'
LOG (nnet3-latgen-faster[5.5.1002~1546-4609e]:RemoveOrphanNodes():nnet-nnet.cc:948) Removed 0 orphan nodes.
LOG (nnet3-latgen-faster[5.5.1002~1546-4609e]:RemoveOrphanComponents():nnet-nnet.cc:847) Removing 0 orphan components.
lattice-determinize-phone-pruned --beam=8.0 --acoustic-scale=1.0 --minimize=false --word-determinize=false --write-compact=false model_online/final.mdl ark:- ark:-
lattice-scale --acoustic-scale=10.0 --write-compact=false ark:- ark:-
apply-cmvn --utt2spk=ark:data/test_hires/split4/1/utt2spk scp:data/test_hires/split4/1/cmvn.scp scp:data/test_hires/split4/1/feats.scp ark:-
gpu_decode.sh
batched-wav-nnet3-cuda2 --write-lattice=true --frames-per-chunk=140 --extra-left-context-initial=0 --frame-subsampling-factor=3 --config=model_online/conf/online.conf --max-active=7000 --beam=15.0 --lattice-beam=6.0 --acoustic-scale=1.0 --word-symbol-table=model/graph_test/words.txt model_online/final.mdl model/graph_test/HCLG.fst "ark,s,cs:extract-segments scp,p:data/test/split1/1/wav.scp data/test/split1/1/segments ark:- |" "ark:|lattice-scale --acoustic-scale=10.0 ark:- ark:- | gzip -c >model_online/decode_test_gpu/lat.1.gz"
# Started at Wed Mar 16 15:24:21 UTC 2022
#
batched-wav-nnet3-cuda2 --write-lattice=true --frames-per-chunk=140 --extra-left-context-initial=0 --frame-subsampling-factor=3 --config=model_online/conf/online.conf --max-active=7000 --beam=15.0 --lattice-beam=6.0 --acoustic-scale=1.0 --word-symbol-table=model/graph_test/words.txt model_online/final.mdl model/graph_test/HCLG.fst 'ark,s,cs:extract-segments scp,p:data/test/split1/1/wav.scp data/test/split1/1/segments ark:- |' 'ark:|lattice-scale --acoustic-scale=10.0 ark:- ark:- | gzip -c >model_online/decode_test_gpu/lat.1.gz'
LOG (batched-wav-nnet3-cuda2[5.5]:SelectGpuId():cu-device.cc:238) CUDA setup operating under Compute Exclusive Mode.
LOG (batched-wav-nnet3-cuda2[5.5]:FinalizeActiveGpu():cu-device.cc:338) The active GPU is [0]: GeForce RTX 3090 free:23424M, used:843M, total:24268M, free/total:0.965232 version 8.6
LOG (batched-wav-nnet3-cuda2[5.5]:RemoveOrphanNodes():nnet-nnet.cc:948) Removed 0 orphan nodes.
LOG (batched-wav-nnet3-cuda2[5.5]:RemoveOrphanComponents():nnet-nnet.cc:847) Removing 0 orphan components.
lattice-scale --acoustic-scale=10.0 ark:- ark:-
LOG (batched-wav-nnet3-cuda2[5.5]:CheckAndFixConfigs():nnet3/nnet-am-decodable-simple.h:123) Increasing --frames-per-chunk from 140 to 141 to make it a multiple of --frame-subsampling-factor=3
decode_gpu_semiup.sh
# batched-wav-nnet3-cuda2 --write-lattice=true --minimize=false --word-determinize=false --frames-per-chunk=141 --extra-left-context-initial=0 --frame-subsampling-factor=3 --config=model_online/conf/online.conf --max-active=7000 --beam=25.0 --lattice-beam=15.0 --acoustic-scale=1.0 --word-symbol-table=model/graph_test/words.txt --determinize-lattice=false model_online/final.mdl model/graph_test/HCLG.fst "ark,s,cs:extract-segments scp,p:data/test/split1/1/wav.scp data/test/split1/1/segments ark:- |" "ark:| lattice-determinize-phone-pruned --beam=8.0 --acoustic-scale=1.0 --minimize=false --word-determinize=false --write-compact=false model_online/final.mdl ark:- ark:- | lattice-scale --acoustic-scale=10.0 --write-compact=false ark:- ark:- | gzip -c >model_online/decode_test_gpu_semiup/lat.1.gz"
# Started at Wed Mar 16 15:03:11 UTC 2022
#
batched-wav-nnet3-cuda2 --write-lattice=true --minimize=false --word-determinize=false --frames-per-chunk=141 --extra-left-context-initial=0 --frame-subsampling-factor=3 --config=model_online/conf/online.conf --max-active=7000 --beam=25.0 --lattice-beam=15.0 --acoustic-scale=1.0 --word-symbol-table=model/graph_test/words.txt --determinize-lattice=false model_online/final.mdl model/graph_test/HCLG.fst 'ark,s,cs:extract-segments scp,p:data/test/split1/1/wav.scp data/test/split1/1/segments ark:- |' 'ark:| lattice-determinize-phone-pruned --beam=8.0 --acoustic-scale=1.0 --minimize=false --word-determinize=false --write-compact=false model_online/final.mdl ark:- ark:- | lattice-scale --acoustic-scale=10.0 --write-compact=false ark:- ark:- | gzip -c >model_online/decode_test_gpu_semiup/lat.1.gz'
LOG (batched-wav-nnet3-cuda2[5.5]:SelectGpuId():cu-device.cc:238) CUDA setup operating under Compute Exclusive Mode.
LOG (batched-wav-nnet3-cuda2[5.5]:FinalizeActiveGpu():cu-device.cc:338) The active GPU is [0]: GeForce RTX 3090 free:23424M, used:843M, total:24268M, free/total:0.965232 version 8.6
LOG (batched-wav-nnet3-cuda2[5.5]:RemoveOrphanNodes():nnet-nnet.cc:948) Removed 0 orphan nodes.
LOG (batched-wav-nnet3-cuda2[5.5]:RemoveOrphanComponents():nnet-nnet.cc:847) Removing 0 orphan components.
lattice-scale --acoustic-scale=10.0 --write-compact=false ark:- ark:-
lattice-determinize-phone-pruned --beam=8.0 --acoustic-scale=1.0 --minimize=false --word-determinize=false --write-compact=false model_online/final.mdl ark:- ark:-
You apply CMN in nnet3/decode_semisup.sh but not in gpu_semisup.sh (variance normalization is off and mean normalization is on by default in apply-cmvn). Is this the root cause?
GPU decoding (batched-wav-nnet3-cuda2, with or without lattice-determinize-phone-pruned) takes an online config, like online2-wav-nnet3-latgen-faster does; it does not run apply-cmvn on the features but instead configures CMVN through online.conf.
Since GPU decoding without lattice-determinize-phone-pruned gives the same results as CPU decoding, I think the lattices generated by the GPU decoder have a problem, because it first throws this error:
ERROR (lattice-determinize-phone-pruned[5.5.0~1539-ea2b]:LatticeStateTimes():lattice-functions.cc:81) Input lattice must be topologically sorted.
It means the GPU lattices are not topologically sorted.
I think lattices are supposed to be top-sorted when written, possibly the GPU decoder is not doing that, maybe as some kind of optimization. However we could perhaps add TopSortLatticeIfNeeded() to lattice-determinize-phone-pruned.cc as a work-around.
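Roughly, the work-around would sit in the per-lattice loop of latbin/lattice-determinize-phone-pruned.cc. The following is a sketch only, not the exact upstream code: the reader and variable names (lat_reader, det_clat, etc.) follow the usual latbin conventions and may differ from the actual file, and the surrounding scaling/determinization calls are only indicated in comments.

// Sketch: where TopSortLatticeIfNeeded() (declared in lat/lattice-functions.h)
// could be called in lattice-determinize-phone-pruned.cc. Not the exact upstream code.
#include "lat/lattice-functions.h"

for (; !lat_reader.Done(); lat_reader.Next()) {
  std::string key = lat_reader.Key();
  Lattice lat = lat_reader.Value();
  lat_reader.FreeCurrent();

  // Work-around: lattices from the CUDA batched decoder may not be written in
  // topologically sorted order, which LatticeStateTimes() requires.
  TopSortLatticeIfNeeded(&lat);

  // ... the existing scaling and determinization code continues unchanged, e.g.
  // fst::ScaleLattice(...) followed by DeterminizeLatticePhonePrunedWrapper(...).
}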
Many thanks. Yes, I did that and added TopSortLatticeIfNeeded() at line 104 of latbin/lattice-determinize-phone-pruned.cc. Is that correct? But the results are like in my first post.
Ah, you're certainly right about the CMN with a CUDA batch decoder! Internally, the "batch" decoder is built on top of the online decoder. Could you please run an apples-to-apples test on CPU with online2-wav-nnet3-latgen-faster instead of nnet3-latgen-faster, to make it as close to the CUDA case as possible?
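For example, something along these lines (a sketch only, not a command taken from this issue: the option set mirrors steps/online/nnet3/decode.sh and the GPU run above, and the output directory name decode_test_online_cpu is made up; adjust paths and values to your setup):

online2-wav-nnet3-latgen-faster \
  --online=true \
  --config=model_online/conf/online.conf \
  --frame-subsampling-factor=3 --frames-per-chunk=140 \
  --max-active=7000 --beam=15.0 --lattice-beam=6.0 --acoustic-scale=1.0 \
  --word-symbol-table=model/graph_test/words.txt \
  model_online/final.mdl model/graph_test/HCLG.fst \
  "ark:data/test/split1/1/spk2utt" \
  "ark,s,cs:extract-segments scp,p:data/test/split1/1/wav.scp data/test/split1/1/segments ark:- |" \
  "ark:|lattice-scale --acoustic-scale=10.0 ark:- ark:- | gzip -c >model_online/decode_test_online_cpu/lat.1.gz"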
I want to use the CUDA decoder (batched-wav-nnet3-cuda2) in Kaldi for semi-supervised (semiup) decoding. I wrote two scripts: the first one uses the CUDA decoder, and the second adds "lattice-determinize-phone-pruned" to that script so the lattices can be used for semiup decoding, like the semiup decoding in this script.
gpu_decode2.sh.txt gpu_decode2_semiup.sh.txt
1- I get this error for the semiup GPU decoding:
ERROR (lattice-determinize-phone-pruned[5.5.0~1539-ea2b]:LatticeStateTimes():lattice-functions.cc:81) Input lattice must be topologically sorted.
I added TopSortLatticeIfNeeded(&lat); to that binary's source and the error is fixed.
2- I computed WER for CPU, CUDA, and the semiup script. The WERs are almost the same for CPU and CUDA decoding, but when "lattice-determinize-phone-pruned" is added to CUDA decoding the WER gets worse.
In the CUDA decoder with "lattice-determinize-phone-pruned" (gpu_decode2_semiup.sh) I set beam=25.0, lattice_beam=15.0, and beam_determinize=8.0; however, if I set them like the plain CUDA decoder (beam=15.0, lattice_beam=6.0) the result is 55% WER.
It means that if I decode a silence wave with "batched-wav-nnet3-cuda2 | lattice-determinize-phone-pruned", it always decodes a word! How can I fix it? Is my script wrong in how it combines the CUDA decoder and lattice-determinize-phone-pruned?
Best regards