glynpu opened 3 years ago
Here is the log from when the program crashed while decoding test-other:
INFO:root:batch 1910, cuts processed until now is 1943/2939 (66.110922%)
INFO:root:batch 1920, cuts processed until now is 1953/2939 (66.451174%)
INFO:root:batch 1930, cuts processed until now is 1963/2939 (66.791426%)
[F] /ceph-ly/open-source/latest_k2/k2/k2/python/csrc/torch/torch_util.h:122:k2::Array1<U> k2::FromTorch(at::Tensor&) [with T = int] Check failed: tensor.strides()[0] == 1 (4 vs. 1) Expected stride: 1. Given: 4
[ Stack-Trace: ]
/ceph-ly/open-source/latest_k2/k2/build/lib/libk2_log.so(k2::internal::GetStackTrace()+0x5b) [0x7fd36a0f66ba]
/ceph-ly/open-source/latest_k2/k2/build/lib/_k2.cpython-38-x86_64-linux-gnu.so(+0x70a52) [0x7fd36b423a52]
/ceph-ly/open-source/latest_k2/k2/build/lib/_k2.cpython-38-x86_64-linux-gnu.so(+0xb8c8f) [0x7fd36b46bc8f]
/ceph-ly/open-source/latest_k2/k2/build/lib/_k2.cpython-38-x86_64-linux-gnu.so(+0x1088bd) [0x7fd36b4bb8bd]
/ceph-ly/open-source/latest_k2/k2/build/lib/_k2.cpython-38-x86_64-linux-gnu.so(+0x102af4) [0x7fd36b4b5af4]
/ceph-ly/open-source/latest_k2/k2/build/lib/_k2.cpython-38-x86_64-linux-gnu.so(+0x11e695) [0x7fd36b4d1695]
/ceph-ly/open-source/latest_k2/k2/build/lib/_k2.cpython-38-x86_64-linux-gnu.so(+0x11db07) [0x7fd36b4d0b07]
/ceph-ly/open-source/latest_k2/k2/build/lib/_k2.cpython-38-x86_64-linux-gnu.so(+0x116d22) [0x7fd36b4c9d22]
/ceph-ly/open-source/latest_k2/k2/build/lib/_k2.cpython-38-x86_64-linux-gnu.so(+0x116f14) [0x7fd36b4c9f14]
/ceph-ly/open-source/latest_k2/k2/build/lib/_k2.cpython-38-x86_64-linux-gnu.so(+0x54187) [0x7fd36b407187]
python(PyCFunction_Call+0x56) [0x5ff8a6]
python(_PyObject_MakeTpCall+0x28f) [0x5fff6f]
python(_PyEval_EvalFrameDefault+0x5b9e) [0x57e35e]
python(_PyFunction_Vectorcall+0x19c) [0x602b2c]
python(PyVectorcall_Call+0x51) [0x5ff3b1]
/ceph-ly/py38/lib/python3.8/site-packages/torch/lib/libtorch_python.so(THPFunction_apply(_object*, _object*)+0x8fd) [0x7fd45ebdb78d]
python(PyCFunction_Call+0xfb) [0x5ff94b]
python(_PyObject_MakeTpCall+0x28f) [0x5fff6f]
python(_PyEval_EvalFrameDefault+0x5b9e) [0x57e35e]
python(_PyEval_EvalCodeWithName+0x25c) [0x5765ec]
python(_PyFunction_Vectorcall+0x442) [0x602dd2]
python(_PyEval_EvalFrameDefault+0x1930) [0x57a0f0]
python(_PyFunction_Vectorcall+0x19c) [0x602b2c]
python(_PyEval_EvalFrameDefault+0x53f0) [0x57dbb0]
python(_PyEval_EvalCodeWithName+0x25c) [0x5765ec]
python(_PyFunction_Vectorcall+0x442) [0x602dd2]
python(_PyEval_EvalFrameDefault+0x1930) [0x57a0f0]
python(_PyFunction_Vectorcall+0x19c) [0x602b2c]
python(PyVectorcall_Call+0x51) [0x5ff3b1]
python(_PyEval_EvalFrameDefault+0x1c4a) [0x57a40a]
python(_PyEval_EvalCodeWithName+0x25c) [0x5765ec]
python(_PyFunction_Vectorcall+0x247) [0x602bd7]
python(_PyEval_EvalFrameDefault+0x619) [0x578dd9]
python(_PyEval_EvalCodeWithName+0x25c) [0x5765ec]
python(_PyFunction_Vectorcall+0x442) [0x602dd2]
python(_PyEval_EvalFrameDefault+0x1930) [0x57a0f0]
python(_PyEval_EvalCodeWithName+0x25c) [0x5765ec]
python(_PyFunction_Vectorcall+0x442) [0x602dd2]
python(PyVectorcall_Call+0x51) [0x5ff3b1]
python(_PyEval_EvalFrameDefault+0x1c4a) [0x57a40a]
python(_PyEval_EvalCodeWithName+0x25c) [0x5765ec]
python(_PyFunction_Vectorcall+0x442) [0x602dd2]
python(_PyEval_EvalFrameDefault+0x1930) [0x57a0f0]
python(_PyEval_EvalCodeWithName+0x25c) [0x5765ec]
python(_PyFunction_Vectorcall+0x247) [0x602bd7]
python(_PyEval_EvalFrameDefault+0x619) [0x578dd9]
python(_PyEval_EvalCodeWithName+0x25c) [0x5765ec]
python() [0x662c2e]
python(PyRun_FileExFlags+0x97) [0x662d07]
python(PyRun_SimpleFileExFlags+0x17f) [0x663a1f]
Traceback (most recent call last):
File "bpe_ctc_att_conformer_decode.py", line 617, in <module>
File "bpe_ctc_att_conformer_decode.py", line 576, in main
model=model,
File "/ceph-ly/py38/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "bpe_ctc_att_conformer_decode.py", line 278, in decode
model=model,
File "bpe_ctc_att_conformer_decode.py", line 240, in decode_one_batch
lm_scale_list)
File "/ceph-ly/py38/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/ceph-ly/open-source/to_submit/lattice_rescore_snwofall/snowfall/snowfall/decoding/lm_rescore.py", line 320, in rescore_w
ith_whole_lattice
best_paths = k2.shortest_path(inv_lats, use_double_scores=True)
File "/ceph-ly/open-source/latest_k2/k2/k2/python/k2/fsa_algo.py", line 541, in shortest_path
out_fsa = k2.utils.fsa_from_unary_function_tensor(fsa, ragged_arc, arc_map)
File "/ceph-ly/open-source/latest_k2/k2/k2/python/k2/utils.py", line 449, in fsa_from_unary_function_tensor
setattr(dest, name, index_select(value, arc_map,
File "/ceph-ly/open-source/latest_k2/k2/k2/python/k2/ops.py", line 159, in index_select
ans = _IndexSelectFunction.apply(src, index, default_value)
File "/ceph-ly/open-source/latest_k2/k2/k2/python/k2/ops.py", line 65, in forward
return _k2.index_select(src, index, default_value)
RuntimeError: Some bad things happed.
Will have a look. Probably tomorrow.
Can you find the code where it gets 'index' from? Possibly we failed to do clone() at some point to make it a stride-1 tensor if it came from an FSA (but it's still very odd). You may be able to replicate the failure in pdb and debug it that way (let me know by WeChat if running it in pdb shows an error, because I may be able to remember the fix).
The line numbers in utils.py don't seem to match with the current master.
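For context on the stride check that fails above, here is a minimal PyTorch sketch (not taken from the failing code) showing how a 1-D view can end up with stride 4, and how clone()/contiguous() restores stride 1, which is what k2's check expects:

```python
import torch

# A column view of a 2-D int32 tensor is 1-D but has stride 4,
# which is exactly the kind of tensor k2's FromTorch check rejects.
x = torch.arange(12, dtype=torch.int32).reshape(3, 4)
col = x[:, 0]
print(col.stride())    # (4,)

# contiguous() (or clone(), as suggested above) copies the data
# into fresh memory with stride 1.
fixed = col.contiguous()
print(fixed.stride())  # (1,)
```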
Results of n-best rescoring with the transformer decoder:
  | WER% on test_clean | WER% on test_other |
---|---|---|
Encoder + ctc | 3.32 | 7.96 |
Encoder + (ctc + 3-gram) + 4-gram lattice rescore | 2.92 | *(failed when decoding, working on this) |
Encoder + (ctc + 3-gram) + 4-gram lattice rescore + (transformer decoder n-best rescore) num-paths-for-decoder-rescore=100 | 2.87 | *(to be tested) |
Encoder + (ctc + 3-gram) + 4-gram lattice rescore + (transformer decoder n-best rescore) num-paths-for-decoder-rescore=500 | 2.86 | *(to be tested) |
Detailed results:
num-paths-for-decoder-rescore=100
INFO:root:[test-clean-lm_scale_0.6] %WER 2.87% [1510 / 52576, 207 ins, 130 del, 1173 sub ]
num-paths-for-decoder-rescore=500
INFO:root:[test-clean-lm_scale_0.6] %WER 2.86% [1505 / 52576, 207 ins, 128 del, 1170 sub ]
What is the LM scale? I would imagine that when using the transformer decoder, we'd need to scale down the LM probabilities, because that decoder would already account for the LM prob.
What is the LM scale?
Currently there is no scale; the scores are combined as:
tot_scores = am_scores + fgram_lm_scores + decoder_scores
we'd need to scale down the LM probabilities, because that decoder would already account for the LM prob.
Do you mean assigning a weight less than one to lm_scores? Like this:
- tot_scores = am_scores + fgram_lm_scores + decoder_scores
+ lm_score_weight = 0.6 # just a value less than one
+ decoder_score_weight = 0.7 # just a value less than one
+ tot_scores = am_scores + lm_score_weight * fgram_lm_scores + decoder_score_weight * decoder_scores
Do you mean assigning a weight less than one to lm_scores?
I often see people using a combination of weights, whose sum is 1.
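To make the combination concrete, here is a small hedged sketch of the two-scale sweep that produces the WER grids below (the score tensors are hypothetical stand-ins for per-path log-scores):

```python
import torch

# Hypothetical per-path log-scores for the n-best paths of one utterance.
num_paths = 100
am_scores = torch.randn(num_paths)
fgram_lm_scores = torch.randn(num_paths)
decoder_scores = torch.randn(num_paths)

# Sweep both scales, as in the WER grids below; each (lm_scale,
# decoder_scale) pair yields one candidate best path.
for lm_scale in (0.1, 0.3, 0.5, 0.7, 0.9, 1.0, 2.0):
    for decoder_scale in (0.01, 0.1, 0.5, 1.0, 2.0):
        tot_scores = (am_scores
                      + lm_scale * fgram_lm_scores
                      + decoder_scale * decoder_scores)
        best_path_idx = int(tot_scores.argmax())
```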
Computing am/4-gram lm_scores with _unique_tokenseqs seems a little better than with _unique_wordseqs, across a variety of combinations of lm_scale and decoder_scale.
  | WER% on test_clean | WER% on test_other |
---|---|---|
Encoder + ctc | 3.32 | 7.96 |
Encoder + (ctc + 3-gram) + 4-gram lattice rescore | 2.92 | *(failed when decoding, working on this) |
+transformer decoder n-best rescore computing with _unique_wordseqs | 2.87 | *(to be tested) |
+transformer decoder n-best rescore computing with _unique_tokenseqs | 2.81 | *(to be tested) |
WER on test_clean with compute_am_flm_scores_1, computing with _unique_wordseqs (rows: lm_scale; columns: decoder_scale):
lm_scale \ decoder_scale | 0.01 | 0.03 | 0.05 | 0.08 | 0.09 | 0.1 | 0.3 | 0.5 | 0.7 | 0.9 | 1.0 | 2.0 | 4.0 | 6.0 | 8.0 | 10.0 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0.1 | 3.04 | 3.02 | 3.02 | 2.99 | 3.0 | 2.99 | 3.07 | 3.17 | 3.25 | 3.33 | 3.33 | 3.52 | 3.6 | 3.66 | 3.69 | 3.71 |
0.3 | 2.96 | 2.94 | 2.94 | 2.94 | 2.93 | 2.94 | 3.05 | 3.17 | 3.24 | 3.31 | 3.34 | 3.52 | 3.61 | 3.66 | 3.69 | 3.72 |
0.5 | 2.93 | 2.91 | 2.89 | 2.88 | 2.87 | 2.89 | 3.04 | 3.14 | 3.27 | 3.33 | 3.36 | 3.52 | 3.62 | 3.67 | 3.7 | 3.71 |
0.6 | 2.91 | 2.89 | 2.88 | 2.89 | 2.88 | 2.89 | 3.04 | 3.16 | 3.26 | 3.34 | 3.37 | 3.53 | 3.62 | 3.67 | 3.7 | 3.72 |
0.7 | 2.93 | 2.93 | 2.91 | 2.91 | 2.9 | 2.9 | 3.06 | 3.16 | 3.28 | 3.33 | 3.36 | 3.53 | 3.62 | 3.67 | 3.71 | 3.73 |
0.9 | 3.14 | 3.1 | 3.09 | 3.06 | 3.05 | 3.05 | 3.13 | 3.24 | 3.31 | 3.37 | 3.4 | 3.55 | 3.65 | 3.69 | 3.72 | 3.74 |
1.0 | 3.33 | 3.28 | 3.25 | 3.2 | 3.21 | 3.21 | 3.21 | 3.29 | 3.37 | 3.4 | 3.43 | 3.59 | 3.67 | 3.7 | 3.74 | 3.74 |
2.0 | 5.63 | 5.56 | 5.53 | 5.47 | 5.45 | 5.43 | 5.06 | 4.68 | 4.39 | 4.18 | 4.11 | 3.82 | 3.8 | 3.8 | 3.81 | 3.8 |
4.0 | 6.13 | 6.11 | 6.1 | 6.1 | 6.09 | 6.08 | 5.97 | 5.84 | 5.69 | 5.56 | 5.49 | 4.75 | 4.06 | 3.92 | 3.87 | 3.86 |
6.0 | 6.23 | 6.22 | 6.22 | 6.2 | 6.21 | 6.21 | 6.15 | 6.08 | 5.99 | 5.91 | 5.89 | 5.44 | 4.65 | 4.14 | 3.96 | 3.92 |
8.0 | 6.3 | 6.3 | 6.28 | 6.28 | 6.28 | 6.27 | 6.23 | 6.19 | 6.13 | 6.09 | 6.04 | 5.79 | 5.08 | 4.61 | 4.22 | 4.02 |
10.0 | 6.32 | 6.32 | 6.31 | 6.31 | 6.31 | 6.31 | 6.27 | 6.24 | 6.2 | 6.16 | 6.14 | 5.93 | 5.42 | 4.9 | 4.58 | 4.27 |
WER on test_clean with compute_am_flm_scores_2, computing with _unique_tokenseqs (rows: lm_scale; columns: decoder_scale):
lm_scale \ decoder_scale | 0.01 | 0.03 | 0.05 | 0.08 | 0.09 | 0.1 | 0.3 | 0.5 | 0.7 | 0.9 | 1.0 | 2.0 | 4.0 | 6.0 | 8.0 | 10.0 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0.1 | 3.02 | 3.0 | 2.98 | 2.95 | 2.94 | 2.94 | 2.9 | 2.87 | 2.88 | 2.89 | 2.88 | 2.89 | 2.91 | 2.92 | 2.93 | 2.94 |
0.3 | 2.97 | 2.95 | 2.93 | 2.91 | 2.9 | 2.9 | 2.85 | 2.86 | 2.86 | 2.85 | 2.86 | 2.89 | 2.9 | 2.93 | 2.93 | 2.94 |
0.5 | 2.92 | 2.92 | 2.91 | 2.88 | 2.88 | 2.88 | 2.85 | 2.82 | 2.83 | 2.85 | 2.85 | 2.88 | 2.91 | 2.93 | 2.94 | 2.94 |
0.6 | 2.92 | 2.89 | 2.9 | 2.88 | 2.86 | 2.86 | 2.84 | 2.83 | 2.83 | 2.84 | 2.85 | 2.88 | 2.92 | 2.93 | 2.94 | 2.94 |
0.7 | 2.94 | 2.93 | 2.93 | 2.9 | 2.9 | 2.89 | 2.82 | 2.82 | 2.83 | 2.84 | 2.84 | 2.89 | 2.92 | 2.93 | 2.94 | 2.94 |
0.9 | 3.14 | 3.11 | 3.07 | 3.01 | 3.0 | 2.99 | 2.88 | 2.82 | 2.81 | 2.82 | 2.84 | 2.89 | 2.93 | 2.94 | 2.94 | 2.94 |
1.0 | 3.3 | 3.25 | 3.19 | 3.14 | 3.12 | 3.11 | 2.91 | 2.85 | 2.83 | 2.82 | 2.83 | 2.89 | 2.93 | 2.94 | 2.94 | 2.95 |
2.0 | 5.53 | 5.48 | 5.45 | 5.38 | 5.35 | 5.33 | 4.7 | 4.11 | 3.72 | 3.5 | 3.39 | 2.97 | 2.93 | 2.94 | 2.94 | 2.95 |
4.0 | 6.09 | 6.08 | 6.06 | 6.05 | 6.05 | 6.04 | 5.86 | 5.6 | 5.29 | 4.98 | 4.85 | 3.95 | 3.14 | 2.98 | 2.94 | 2.95 |
6.0 | 6.19 | 6.19 | 6.19 | 6.17 | 6.18 | 6.17 | 6.08 | 5.94 | 5.79 | 5.61 | 5.5 | 4.67 | 3.76 | 3.25 | 3.02 | 2.98 |
8.0 | 6.25 | 6.25 | 6.25 | 6.25 | 6.24 | 6.23 | 6.16 | 6.08 | 5.97 | 5.87 | 5.81 | 5.18 | 4.18 | 3.67 | 3.3 | 3.09 |
10.0 | 6.28 | 6.28 | 6.27 | 6.27 | 6.27 | 6.26 | 6.21 | 6.15 | 6.08 | 5.99 | 5.94 | 5.48 | 4.54 | 3.99 | 3.63 | 3.36 |
log of compute_am_flm_scores_1:
lm_scale_0.5_decoder_scale_0.09 2.87 best for test-clean
lm_scale_0.5_decoder_scale_0.08 2.88
lm_scale_0.6_decoder_scale_0.05 2.88
lm_scale_0.6_decoder_scale_0.09 2.88
lm_scale_0.5_decoder_scale_0.1 2.89
lm_scale_0.5_decoder_scale_0.05 2.89
lm_scale_0.6_decoder_scale_0.1 2.89
lm_scale_0.6_decoder_scale_0.03 2.89
lm_scale_0.6_decoder_scale_0.08 2.89
lm_scale_0.7_decoder_scale_0.1 2.9
lm_scale_0.7_decoder_scale_0.09 2.9
lm_scale_0.5_decoder_scale_0.03 2.91
lm_scale_0.6_decoder_scale_0.01 2.91
lm_scale_0.7_decoder_scale_0.05 2.91
lm_scale_0.7_decoder_scale_0.08 2.91
lm_scale_0.3_decoder_scale_0.09 2.93
lm_scale_0.5_decoder_scale_0.01 2.93
lm_scale_0.7_decoder_scale_0.01 2.93
lm_scale_0.7_decoder_scale_0.03 2.93
lm_scale_0.3_decoder_scale_0.1 2.94
lm_scale_0.3_decoder_scale_0.03 2.94
lm_scale_0.3_decoder_scale_0.05 2.94
lm_scale_0.3_decoder_scale_0.08 2.94
lm_scale_0.3_decoder_scale_0.01 2.96
lm_scale_0.1_decoder_scale_0.1 2.99
lm_scale_0.1_decoder_scale_0.08 2.99
lm_scale_0.1_decoder_scale_0.09 3.0
lm_scale_0.1_decoder_scale_0.03 3.02
lm_scale_0.1_decoder_scale_0.05 3.02
lm_scale_0.1_decoder_scale_0.01 3.04
lm_scale_0.5_decoder_scale_0.3 3.04
lm_scale_0.6_decoder_scale_0.3 3.04
lm_scale_0.3_decoder_scale_0.3 3.05
lm_scale_0.9_decoder_scale_0.1 3.05
lm_scale_0.9_decoder_scale_0.09 3.05
lm_scale_0.7_decoder_scale_0.3 3.06
lm_scale_0.9_decoder_scale_0.08 3.06
lm_scale_0.1_decoder_scale_0.3 3.07
lm_scale_0.9_decoder_scale_0.05 3.09
lm_scale_0.9_decoder_scale_0.03 3.1
lm_scale_0.9_decoder_scale_0.3 3.13
lm_scale_0.5_decoder_scale_0.5 3.14
lm_scale_0.9_decoder_scale_0.01 3.14
lm_scale_0.6_decoder_scale_0.5 3.16
lm_scale_0.7_decoder_scale_0.5 3.16
lm_scale_0.1_decoder_scale_0.5 3.17
lm_scale_0.3_decoder_scale_0.5 3.17
lm_scale_1.0_decoder_scale_0.08 3.2
lm_scale_1.0_decoder_scale_0.1 3.21
lm_scale_1.0_decoder_scale_0.3 3.21
lm_scale_1.0_decoder_scale_0.09 3.21
lm_scale_0.3_decoder_scale_0.7 3.24
lm_scale_0.9_decoder_scale_0.5 3.24
lm_scale_0.1_decoder_scale_0.7 3.25
lm_scale_1.0_decoder_scale_0.05 3.25
lm_scale_0.6_decoder_scale_0.7 3.26
lm_scale_0.5_decoder_scale_0.7 3.27
lm_scale_0.7_decoder_scale_0.7 3.28
lm_scale_1.0_decoder_scale_0.03 3.28
lm_scale_1.0_decoder_scale_0.5 3.29
lm_scale_0.3_decoder_scale_0.9 3.31
lm_scale_0.9_decoder_scale_0.7 3.31
lm_scale_0.1_decoder_scale_0.9 3.33
lm_scale_0.1_decoder_scale_1.0 3.33
lm_scale_0.5_decoder_scale_0.9 3.33
lm_scale_0.7_decoder_scale_0.9 3.33
lm_scale_1.0_decoder_scale_0.01 3.33
lm_scale_0.3_decoder_scale_1.0 3.34
lm_scale_0.6_decoder_scale_0.9 3.34
lm_scale_0.5_decoder_scale_1.0 3.36
lm_scale_0.7_decoder_scale_1.0 3.36
lm_scale_0.6_decoder_scale_1.0 3.37
lm_scale_0.9_decoder_scale_0.9 3.37
lm_scale_1.0_decoder_scale_0.7 3.37
lm_scale_0.9_decoder_scale_1.0 3.4
lm_scale_1.0_decoder_scale_0.9 3.4
lm_scale_1.0_decoder_scale_1.0 3.43
lm_scale_0.1_decoder_scale_2.0 3.52
lm_scale_0.3_decoder_scale_2.0 3.52
lm_scale_0.5_decoder_scale_2.0 3.52
lm_scale_0.6_decoder_scale_2.0 3.53
lm_scale_0.7_decoder_scale_2.0 3.53
lm_scale_0.9_decoder_scale_2.0 3.55
lm_scale_1.0_decoder_scale_2.0 3.59
lm_scale_0.1_decoder_scale_4.0 3.6
lm_scale_0.3_decoder_scale_4.0 3.61
lm_scale_0.5_decoder_scale_4.0 3.62
lm_scale_0.6_decoder_scale_4.0 3.62
lm_scale_0.7_decoder_scale_4.0 3.62
lm_scale_0.9_decoder_scale_4.0 3.65
lm_scale_0.1_decoder_scale_6.0 3.66
lm_scale_0.3_decoder_scale_6.0 3.66
lm_scale_0.5_decoder_scale_6.0 3.67
lm_scale_0.6_decoder_scale_6.0 3.67
lm_scale_0.7_decoder_scale_6.0 3.67
lm_scale_1.0_decoder_scale_4.0 3.67
lm_scale_0.1_decoder_scale_8.0 3.69
lm_scale_0.3_decoder_scale_8.0 3.69
lm_scale_0.9_decoder_scale_6.0 3.69
lm_scale_0.5_decoder_scale_8.0 3.7
lm_scale_0.6_decoder_scale_8.0 3.7
lm_scale_1.0_decoder_scale_6.0 3.7
lm_scale_0.1_decoder_scale_10.0 3.71
lm_scale_0.5_decoder_scale_10.0 3.71
lm_scale_0.7_decoder_scale_8.0 3.71
lm_scale_0.3_decoder_scale_10.0 3.72
lm_scale_0.6_decoder_scale_10.0 3.72
lm_scale_0.9_decoder_scale_8.0 3.72
lm_scale_0.7_decoder_scale_10.0 3.73
lm_scale_0.9_decoder_scale_10.0 3.74
lm_scale_1.0_decoder_scale_8.0 3.74
lm_scale_1.0_decoder_scale_10.0 3.74
lm_scale_2.0_decoder_scale_4.0 3.8
lm_scale_2.0_decoder_scale_6.0 3.8
lm_scale_2.0_decoder_scale_10.0 3.8
lm_scale_2.0_decoder_scale_8.0 3.81
lm_scale_2.0_decoder_scale_2.0 3.82
lm_scale_4.0_decoder_scale_10.0 3.86
lm_scale_4.0_decoder_scale_8.0 3.87
lm_scale_4.0_decoder_scale_6.0 3.92
lm_scale_6.0_decoder_scale_10.0 3.92
lm_scale_6.0_decoder_scale_8.0 3.96
lm_scale_8.0_decoder_scale_10.0 4.02
lm_scale_4.0_decoder_scale_4.0 4.06
lm_scale_2.0_decoder_scale_1.0 4.11
lm_scale_6.0_decoder_scale_6.0 4.14
lm_scale_2.0_decoder_scale_0.9 4.18
lm_scale_8.0_decoder_scale_8.0 4.22
lm_scale_10.0_decoder_scale_10.0 4.27
lm_scale_2.0_decoder_scale_0.7 4.39
lm_scale_10.0_decoder_scale_8.0 4.58
lm_scale_8.0_decoder_scale_6.0 4.61
lm_scale_6.0_decoder_scale_4.0 4.65
lm_scale_2.0_decoder_scale_0.5 4.68
lm_scale_4.0_decoder_scale_2.0 4.75
lm_scale_10.0_decoder_scale_6.0 4.9
lm_scale_2.0_decoder_scale_0.3 5.06
lm_scale_8.0_decoder_scale_4.0 5.08
lm_scale_10.0_decoder_scale_4.0 5.42
lm_scale_2.0_decoder_scale_0.1 5.43
lm_scale_6.0_decoder_scale_2.0 5.44
lm_scale_2.0_decoder_scale_0.09 5.45
lm_scale_2.0_decoder_scale_0.08 5.47
lm_scale_4.0_decoder_scale_1.0 5.49
lm_scale_2.0_decoder_scale_0.05 5.53
lm_scale_2.0_decoder_scale_0.03 5.56
lm_scale_4.0_decoder_scale_0.9 5.56
lm_scale_2.0_decoder_scale_0.01 5.63
lm_scale_4.0_decoder_scale_0.7 5.69
lm_scale_8.0_decoder_scale_2.0 5.79
lm_scale_4.0_decoder_scale_0.5 5.84
lm_scale_6.0_decoder_scale_1.0 5.89
lm_scale_6.0_decoder_scale_0.9 5.91
lm_scale_10.0_decoder_scale_2.0 5.93
lm_scale_4.0_decoder_scale_0.3 5.97
lm_scale_6.0_decoder_scale_0.7 5.99
lm_scale_8.0_decoder_scale_1.0 6.04
lm_scale_4.0_decoder_scale_0.1 6.08
lm_scale_6.0_decoder_scale_0.5 6.08
lm_scale_4.0_decoder_scale_0.09 6.09
lm_scale_8.0_decoder_scale_0.9 6.09
lm_scale_4.0_decoder_scale_0.05 6.1
lm_scale_4.0_decoder_scale_0.08 6.1
lm_scale_4.0_decoder_scale_0.03 6.11
lm_scale_4.0_decoder_scale_0.01 6.13
lm_scale_8.0_decoder_scale_0.7 6.13
lm_scale_10.0_decoder_scale_1.0 6.14
lm_scale_6.0_decoder_scale_0.3 6.15
lm_scale_10.0_decoder_scale_0.9 6.16
lm_scale_8.0_decoder_scale_0.5 6.19
lm_scale_6.0_decoder_scale_0.08 6.2
lm_scale_10.0_decoder_scale_0.7 6.2
lm_scale_6.0_decoder_scale_0.1 6.21
lm_scale_6.0_decoder_scale_0.09 6.21
lm_scale_6.0_decoder_scale_0.03 6.22
lm_scale_6.0_decoder_scale_0.05 6.22
lm_scale_6.0_decoder_scale_0.01 6.23
lm_scale_8.0_decoder_scale_0.3 6.23
lm_scale_10.0_decoder_scale_0.5 6.24
lm_scale_8.0_decoder_scale_0.1 6.27
lm_scale_10.0_decoder_scale_0.3 6.27
lm_scale_8.0_decoder_scale_0.05 6.28
lm_scale_8.0_decoder_scale_0.08 6.28
lm_scale_8.0_decoder_scale_0.09 6.28
lm_scale_8.0_decoder_scale_0.01 6.3
lm_scale_8.0_decoder_scale_0.03 6.3
lm_scale_10.0_decoder_scale_0.1 6.31
lm_scale_10.0_decoder_scale_0.05 6.31
lm_scale_10.0_decoder_scale_0.08 6.31
lm_scale_10.0_decoder_scale_0.09 6.31
lm_scale_10.0_decoder_scale_0.01 6.32
lm_scale_10.0_decoder_scale_0.03 6.32
log of compute_am_flm_scores_2:
lm_scale_0.9_decoder_scale_0.7 2.81 best for test-clean
lm_scale_0.5_decoder_scale_0.5 2.82
lm_scale_0.7_decoder_scale_0.3 2.82
lm_scale_0.7_decoder_scale_0.5 2.82
lm_scale_0.9_decoder_scale_0.5 2.82
lm_scale_0.9_decoder_scale_0.9 2.82
lm_scale_1.0_decoder_scale_0.9 2.82
lm_scale_0.5_decoder_scale_0.7 2.83
lm_scale_0.6_decoder_scale_0.5 2.83
lm_scale_0.6_decoder_scale_0.7 2.83
lm_scale_0.7_decoder_scale_0.7 2.83
lm_scale_1.0_decoder_scale_0.7 2.83
lm_scale_1.0_decoder_scale_1.0 2.83
lm_scale_0.6_decoder_scale_0.3 2.84
lm_scale_0.6_decoder_scale_0.9 2.84
lm_scale_0.7_decoder_scale_0.9 2.84
lm_scale_0.7_decoder_scale_1.0 2.84
lm_scale_0.9_decoder_scale_1.0 2.84
lm_scale_0.3_decoder_scale_0.3 2.85
lm_scale_0.3_decoder_scale_0.9 2.85
lm_scale_0.5_decoder_scale_0.3 2.85
lm_scale_0.5_decoder_scale_0.9 2.85
lm_scale_0.5_decoder_scale_1.0 2.85
lm_scale_0.6_decoder_scale_1.0 2.85
lm_scale_1.0_decoder_scale_0.5 2.85
lm_scale_0.3_decoder_scale_0.5 2.86
lm_scale_0.3_decoder_scale_0.7 2.86
lm_scale_0.3_decoder_scale_1.0 2.86
lm_scale_0.6_decoder_scale_0.1 2.86
lm_scale_0.6_decoder_scale_0.09 2.86
lm_scale_0.1_decoder_scale_0.5 2.87
lm_scale_0.1_decoder_scale_0.7 2.88
lm_scale_0.1_decoder_scale_1.0 2.88
lm_scale_0.5_decoder_scale_0.1 2.88
lm_scale_0.5_decoder_scale_2.0 2.88
lm_scale_0.5_decoder_scale_0.08 2.88
lm_scale_0.5_decoder_scale_0.09 2.88
lm_scale_0.6_decoder_scale_2.0 2.88
lm_scale_0.6_decoder_scale_0.08 2.88
lm_scale_0.9_decoder_scale_0.3 2.88
lm_scale_0.1_decoder_scale_0.9 2.89
lm_scale_0.1_decoder_scale_2.0 2.89
lm_scale_0.3_decoder_scale_2.0 2.89
lm_scale_0.6_decoder_scale_0.03 2.89
lm_scale_0.7_decoder_scale_0.1 2.89
lm_scale_0.7_decoder_scale_2.0 2.89
lm_scale_0.9_decoder_scale_2.0 2.89
lm_scale_1.0_decoder_scale_2.0 2.89
lm_scale_0.1_decoder_scale_0.3 2.9
lm_scale_0.3_decoder_scale_0.1 2.9
lm_scale_0.3_decoder_scale_4.0 2.9
lm_scale_0.3_decoder_scale_0.09 2.9
lm_scale_0.6_decoder_scale_0.05 2.9
lm_scale_0.7_decoder_scale_0.08 2.9
lm_scale_0.7_decoder_scale_0.09 2.9
lm_scale_0.1_decoder_scale_4.0 2.91
lm_scale_0.3_decoder_scale_0.08 2.91
lm_scale_0.5_decoder_scale_4.0 2.91
lm_scale_0.5_decoder_scale_0.05 2.91
lm_scale_1.0_decoder_scale_0.3 2.91
lm_scale_0.1_decoder_scale_6.0 2.92
lm_scale_0.5_decoder_scale_0.01 2.92
lm_scale_0.5_decoder_scale_0.03 2.92
lm_scale_0.6_decoder_scale_4.0 2.92
lm_scale_0.6_decoder_scale_0.01 2.92
lm_scale_0.7_decoder_scale_4.0 2.92
lm_scale_0.1_decoder_scale_8.0 2.93
lm_scale_0.3_decoder_scale_6.0 2.93
lm_scale_0.3_decoder_scale_8.0 2.93
lm_scale_0.3_decoder_scale_0.05 2.93
lm_scale_0.5_decoder_scale_6.0 2.93
lm_scale_0.6_decoder_scale_6.0 2.93
lm_scale_0.7_decoder_scale_6.0 2.93
lm_scale_0.7_decoder_scale_0.03 2.93
lm_scale_0.7_decoder_scale_0.05 2.93
lm_scale_0.9_decoder_scale_4.0 2.93
lm_scale_1.0_decoder_scale_4.0 2.93
lm_scale_2.0_decoder_scale_4.0 2.93
lm_scale_0.1_decoder_scale_0.1 2.94
lm_scale_0.1_decoder_scale_10.0 2.94
lm_scale_0.1_decoder_scale_0.09 2.94
lm_scale_0.3_decoder_scale_10.0 2.94
lm_scale_0.5_decoder_scale_8.0 2.94
lm_scale_0.5_decoder_scale_10.0 2.94
lm_scale_0.6_decoder_scale_8.0 2.94
lm_scale_0.6_decoder_scale_10.0 2.94
lm_scale_0.7_decoder_scale_8.0 2.94
lm_scale_0.7_decoder_scale_10.0 2.94
lm_scale_0.7_decoder_scale_0.01 2.94
lm_scale_0.9_decoder_scale_6.0 2.94
lm_scale_0.9_decoder_scale_8.0 2.94
lm_scale_0.9_decoder_scale_10.0 2.94
lm_scale_1.0_decoder_scale_6.0 2.94
lm_scale_1.0_decoder_scale_8.0 2.94
lm_scale_2.0_decoder_scale_6.0 2.94
lm_scale_2.0_decoder_scale_8.0 2.94
lm_scale_4.0_decoder_scale_8.0 2.94
lm_scale_0.1_decoder_scale_0.08 2.95
lm_scale_0.3_decoder_scale_0.03 2.95
lm_scale_1.0_decoder_scale_10.0 2.95
lm_scale_2.0_decoder_scale_10.0 2.95
lm_scale_4.0_decoder_scale_10.0 2.95
lm_scale_0.3_decoder_scale_0.01 2.97
lm_scale_2.0_decoder_scale_2.0 2.97
lm_scale_0.1_decoder_scale_0.05 2.98
lm_scale_4.0_decoder_scale_6.0 2.98
lm_scale_6.0_decoder_scale_10.0 2.98
lm_scale_0.9_decoder_scale_0.1 2.99
lm_scale_0.1_decoder_scale_0.03 3.0
lm_scale_0.9_decoder_scale_0.09 3.0
lm_scale_0.9_decoder_scale_0.08 3.01
lm_scale_0.1_decoder_scale_0.01 3.02
lm_scale_6.0_decoder_scale_8.0 3.02
lm_scale_0.9_decoder_scale_0.05 3.07
lm_scale_8.0_decoder_scale_10.0 3.09
lm_scale_0.9_decoder_scale_0.03 3.11
lm_scale_1.0_decoder_scale_0.1 3.11
lm_scale_1.0_decoder_scale_0.09 3.12
lm_scale_0.9_decoder_scale_0.01 3.14
lm_scale_1.0_decoder_scale_0.08 3.14
lm_scale_4.0_decoder_scale_4.0 3.14
lm_scale_1.0_decoder_scale_0.05 3.19
lm_scale_1.0_decoder_scale_0.03 3.25
lm_scale_6.0_decoder_scale_6.0 3.25
lm_scale_1.0_decoder_scale_0.01 3.3
lm_scale_8.0_decoder_scale_8.0 3.3
lm_scale_10.0_decoder_scale_10.0 3.36
lm_scale_2.0_decoder_scale_1.0 3.39
lm_scale_2.0_decoder_scale_0.9 3.5
lm_scale_10.0_decoder_scale_8.0 3.63
lm_scale_8.0_decoder_scale_6.0 3.67
lm_scale_2.0_decoder_scale_0.7 3.72
lm_scale_6.0_decoder_scale_4.0 3.76
lm_scale_4.0_decoder_scale_2.0 3.95
lm_scale_10.0_decoder_scale_6.0 3.99
lm_scale_2.0_decoder_scale_0.5 4.11
lm_scale_8.0_decoder_scale_4.0 4.18
lm_scale_10.0_decoder_scale_4.0 4.54
lm_scale_6.0_decoder_scale_2.0 4.67
lm_scale_2.0_decoder_scale_0.3 4.7
lm_scale_4.0_decoder_scale_1.0 4.85
lm_scale_4.0_decoder_scale_0.9 4.98
lm_scale_8.0_decoder_scale_2.0 5.18
lm_scale_4.0_decoder_scale_0.7 5.29
lm_scale_2.0_decoder_scale_0.1 5.33
lm_scale_2.0_decoder_scale_0.09 5.35
lm_scale_2.0_decoder_scale_0.08 5.38
lm_scale_2.0_decoder_scale_0.05 5.45
lm_scale_2.0_decoder_scale_0.03 5.48
lm_scale_10.0_decoder_scale_2.0 5.48
lm_scale_6.0_decoder_scale_1.0 5.5
lm_scale_2.0_decoder_scale_0.01 5.53
lm_scale_4.0_decoder_scale_0.5 5.6
lm_scale_6.0_decoder_scale_0.9 5.61
lm_scale_6.0_decoder_scale_0.7 5.79
lm_scale_8.0_decoder_scale_1.0 5.81
lm_scale_4.0_decoder_scale_0.3 5.86
lm_scale_8.0_decoder_scale_0.9 5.87
lm_scale_6.0_decoder_scale_0.5 5.94
lm_scale_10.0_decoder_scale_1.0 5.94
lm_scale_8.0_decoder_scale_0.7 5.97
lm_scale_10.0_decoder_scale_0.9 5.99
lm_scale_4.0_decoder_scale_0.1 6.04
lm_scale_4.0_decoder_scale_0.08 6.05
lm_scale_4.0_decoder_scale_0.09 6.05
lm_scale_4.0_decoder_scale_0.05 6.06
lm_scale_4.0_decoder_scale_0.03 6.08
lm_scale_6.0_decoder_scale_0.3 6.08
lm_scale_8.0_decoder_scale_0.5 6.08
lm_scale_10.0_decoder_scale_0.7 6.08
lm_scale_4.0_decoder_scale_0.01 6.09
lm_scale_10.0_decoder_scale_0.5 6.15
lm_scale_8.0_decoder_scale_0.3 6.16
lm_scale_6.0_decoder_scale_0.1 6.17
lm_scale_6.0_decoder_scale_0.08 6.17
lm_scale_6.0_decoder_scale_0.09 6.18
lm_scale_6.0_decoder_scale_0.01 6.19
lm_scale_6.0_decoder_scale_0.03 6.19
lm_scale_6.0_decoder_scale_0.05 6.19
lm_scale_10.0_decoder_scale_0.3 6.21
lm_scale_8.0_decoder_scale_0.1 6.23
lm_scale_8.0_decoder_scale_0.09 6.24
lm_scale_8.0_decoder_scale_0.01 6.25
lm_scale_8.0_decoder_scale_0.03 6.25
lm_scale_8.0_decoder_scale_0.05 6.25
lm_scale_8.0_decoder_scale_0.08 6.25
lm_scale_10.0_decoder_scale_0.1 6.26
lm_scale_10.0_decoder_scale_0.05 6.27
lm_scale_10.0_decoder_scale_0.08 6.27
lm_scale_10.0_decoder_scale_0.09 6.27
lm_scale_10.0_decoder_scale_0.01 6.28
lm_scale_10.0_decoder_scale_0.03 6.28
I just want to make sure you know how to get the unique token sequences from paths in the FSA. (Not sure if this is something that needs fixing, sorry.)
By unique token sequences I mean without the repeats that come from the CTC topo, or the epsilons. The way to do this is to use inner_labels='tokens' or something like that when doing the composition with the CTC topo during graph construction, and then use fsa.tokens to obtain these from the lattices when you need them. Any other way may not be correct if we are using the new/simplified CTC topo, because any repeats of the same token will be converted into a single token, so certain words or word-sequences might become impossible to recognize.
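A toy, hedged sketch of that approach (the real graph construction in snowfall is more involved; the ctc_topo/linear_fsa here are just stand-ins for the actual decoding graph):

```python
import k2

# Toy CTC topology over a 3-token vocabulary (token 0 is the blank).
ctc_topo = k2.arc_sort(k2.ctc_topo(max_token=3))

# A toy LM-side FSA standing in for LG, accepting the token sequence [1, 2, 3].
lg = k2.arc_sort(k2.linear_fsa([1, 2, 3]))

# inner_labels='tokens' stores the matched inner (token) labels on each arc,
# so graphs and lattices derived from this composition expose `.tokens`.
graph = k2.compose(ctc_topo, lg, inner_labels='tokens')
print(graph.tokens)  # one token id per arc, as a 1-D torch.Tensor
```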
Did you use a batch size of 1? If your decoding result is an empty FSA, you will encounter this kind of error when calling k2.shortest_path. The solution is to return rescoring_lats directly.
https://github.com/k2-fsa/snowfall/blob/5c979cce1b6a9c9bf72ec484746143b321ae73a7/snowfall/decoding/lm_rescore.py#L306
The reason is that the following line https://github.com/k2-fsa/k2/blob/069425e301472e7ea31ea982ba2a943ac5fcb649/k2/python/k2/fsa.py#L894
if src_name == 'labels':
    value = value.clone()
returns a tensor with stride == 4 if value is empty.
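A minimal sketch of the suggested guard (hypothetical helper name; assumes k2's Fsa.num_arcs property, not the exact patch in lm_rescore.py):

```python
import k2

def shortest_path_or_passthrough(inv_lats: k2.Fsa) -> k2.Fsa:
    # An empty rescored lattice (seen here with batch size 1) trips the
    # stride check inside k2.shortest_path, so return it unchanged instead.
    if inv_lats.num_arcs == 0:
        return inv_lats
    return k2.shortest_path(inv_lats, use_double_scores=True)
```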
We should modify the code that crashes to be insensitive to the stride if any of the dims is zero. Kangwei, perhaps you could do that?
We should modify the code that crashes to be insensitive to the stride if any of the dims is zero. Kangwei, perhaps you could do that?
Sure.
I just want to make sure you know how to get the unique token sequences from paths in the FSA. (Not sure if this is something that needs fixing, sorry).
After removing repeated tokens and using log_semiring=False, WER on test-clean decreased from 2.81 (last week) to 2.73 (now); see the sketch below for what removing repeated tokens means here.
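A tiny sketch of the token de-duplication (hypothetical helper; collapses consecutive duplicates in a token sequence):

```python
import itertools

def remove_repeated_tokens(tokens):
    # [5, 5, 8, 8, 8, 3] -> [5, 8, 3]: keep one token per consecutive run.
    return [t for t, _ in itertools.groupby(tokens)]
```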
Detailed results with different scale combinations:
lm_scale \ decoder_scale | 0.1 | 0.3 | 0.5 | 0.6 | 0.7 | 0.9 | 1.0 | 1.1 | 1.2 | 1.3 | 1.5 | 1.7 | 1.9 | 2.0 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0.1 | 2.98 | 2.95 | 2.92 | 2.9 | 2.9 | 2.89 | 2.89 | 2.88 | 2.87 | 2.86 | 2.85 | 2.85 | 2.85 | 2.84 |
0.3 | 2.91 | 2.88 | 2.88 | 2.88 | 2.87 | 2.87 | 2.85 | 2.85 | 2.85 | 2.85 | 2.84 | 2.84 | 2.83 | 2.83 |
0.5 | 2.88 | 2.86 | 2.83 | 2.84 | 2.84 | 2.84 | 2.83 | 2.84 | 2.83 | 2.82 | 2.82 | 2.83 | 2.83 | 2.83 |
0.6 | 2.86 | 2.82 | 2.82 | 2.81 | 2.82 | 2.82 | 2.82 | 2.82 | 2.82 | 2.81 | 2.81 | 2.82 | 2.82 | 2.82 |
0.7 | 2.87 | 2.8 | 2.78 | 2.79 | 2.8 | 2.81 | 2.81 | 2.8 | 2.8 | 2.8 | 2.8 | 2.82 | 2.82 | 2.82 |
0.9 | 2.99 | 2.84 | 2.78 | 2.76 | 2.77 | 2.76 | 2.76 | 2.76 | 2.77 | 2.78 | 2.79 | 2.79 | 2.8 | 2.8 |
1.0 | 3.12 | 2.89 | 2.8 | 2.77 | 2.77 | 2.75 | 2.74 | 2.74 | 2.76 | 2.77 | 2.78 | 2.79 | 2.79 | 2.79 |
1.1 | 3.32 | 3.0 | 2.82 | 2.8 | 2.77 | 2.74 | 2.73 | 2.74 | 2.73 | 2.74 | 2.77 | 2.78 | 2.78 | 2.78 |
1.2 | 3.58 | 3.13 | 2.9 | 2.85 | 2.8 | 2.77 | 2.74 | 2.74 | 2.73 | 2.74 | 2.73 | 2.76 | 2.77 | 2.77 |
1.3 | 3.87 | 3.3 | 3.0 | 2.92 | 2.87 | 2.79 | 2.76 | 2.77 | 2.75 | 2.74 | 2.74 | 2.74 | 2.75 | 2.76 |
1.5 | 4.45 | 3.78 | 3.28 | 3.17 | 3.03 | 2.88 | 2.85 | 2.82 | 2.78 | 2.77 | 2.74 | 2.73 | 2.74 | 2.73 |
1.7 | 4.84 | 4.24 | 3.76 | 3.54 | 3.31 | 3.06 | 2.99 | 2.93 | 2.88 | 2.84 | 2.8 | 2.77 | 2.75 | 2.75 |
1.9 | 5.11 | 4.65 | 4.15 | 3.95 | 3.73 | 3.33 | 3.2 | 3.12 | 3.03 | 2.98 | 2.88 | 2.84 | 2.8 | 2.79 |
2.0 | 5.19 | 4.81 | 4.37 | 4.11 | 3.92 | 3.54 | 3.34 | 3.23 | 3.13 | 3.05 | 2.95 | 2.88 | 2.83 | 2.81 |
The result with batch_size > 1 is slightly worse than with batch_size == 1 (2.74 vs. 2.73), and the lowest WER is obtained with a different lm_scale/decoder_scale setting.
Detailed results:
lm_scale \ decoder_scale | 0.1 | 0.3 | 0.5 | 0.6 | 0.7 | 0.9 | 1.0 | 1.1 | 1.2 | 1.3 | 1.5 | 1.7 | 1.9 | 2.0 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0.1 | 2.99 | 2.98 | 2.94 | 2.92 | 2.92 | 2.92 | 2.91 | 2.91 | 2.9 | 2.9 | 2.89 | 2.89 | 2.89 | 2.89 |
0.3 | 2.9 | 2.9 | 2.9 | 2.9 | 2.9 | 2.89 | 2.88 | 2.87 | 2.88 | 2.88 | 2.86 | 2.86 | 2.86 | 2.86 |
0.5 | 2.88 | 2.85 | 2.85 | 2.87 | 2.86 | 2.85 | 2.85 | 2.86 | 2.85 | 2.85 | 2.85 | 2.85 | 2.86 | 2.86 |
0.6 | 2.86 | 2.83 | 2.82 | 2.82 | 2.84 | 2.84 | 2.84 | 2.84 | 2.85 | 2.85 | 2.85 | 2.85 | 2.86 | 2.86 |
0.7 | 2.86 | 2.81 | 2.79 | 2.8 | 2.81 | 2.83 | 2.83 | 2.83 | 2.84 | 2.84 | 2.85 | 2.86 | 2.85 | 2.85 |
0.9 | 2.98 | 2.84 | 2.79 | 2.76 | 2.77 | 2.78 | 2.78 | 2.8 | 2.81 | 2.82 | 2.82 | 2.82 | 2.83 | 2.84 |
1.0 | 3.12 | 2.88 | 2.81 | 2.79 | 2.77 | 2.76 | 2.76 | 2.78 | 2.79 | 2.81 | 2.82 | 2.81 | 2.82 | 2.82 |
1.1 | 3.31 | 3.0 | 2.83 | 2.81 | 2.79 | 2.76 | 2.75 | 2.75 | 2.75 | 2.77 | 2.8 | 2.8 | 2.81 | 2.81 |
1.2 | 3.59 | 3.13 | 2.9 | 2.85 | 2.81 | 2.79 | 2.77 | 2.76 | 2.75 | 2.76 | 2.76 | 2.79 | 2.8 | 2.8 |
1.3 | 3.87 | 3.3 | 3.01 | 2.93 | 2.87 | 2.79 | 2.78 | 2.79 | 2.77 | 2.76 | 2.76 | 2.77 | 2.78 | 2.79 |
1.5 | 4.43 | 3.81 | 3.29 | 3.17 | 3.05 | 2.9 | 2.87 | 2.84 | 2.8 | 2.78 | 2.77 | 2.74 | 2.75 | 2.75 |
1.7 | 4.86 | 4.28 | 3.79 | 3.56 | 3.32 | 3.07 | 3.0 | 2.95 | 2.89 | 2.87 | 2.82 | 2.79 | 2.78 | 2.77 |
1.9 | 5.15 | 4.68 | 4.17 | 3.96 | 3.74 | 3.33 | 3.21 | 3.13 | 3.04 | 2.99 | 2.88 | 2.85 | 2.82 | 2.81 |
2.0 | 5.22 | 4.83 | 4.37 | 4.13 | 3.92 | 3.55 | 3.34 | 3.24 | 3.14 | 3.07 | 2.95 | 2.87 | 2.84 | 2.82 |
As suggested by fangjun, the crash when decoding test-other is solved by using batch_size > 1. Current results are:
  | WER% on test_clean | WER% on test_other |
---|---|---|
Encoder + ctc | 3.32 | 7.96 |
Encoder + (ctc + 3-gram) + 4-gram lattice rescore | 2.92 | *(to be tested) |
Encoder + (ctc + 3-gram) + 4-gram lattice rescore + (transformer decoder n-best rescore) num-paths-for-decoder-rescore=100 | 2.87 | *(to be tested) |
Encoder + (ctc + 3-gram) + 4-gram lattice rescore + (transformer decoder n-best rescore) num-paths-for-decoder-rescore=500 | 2.86 | *(to be tested) |
+ log_semiring=False and remove repeated tokens | 2.73 | 6.11 |
Fantastic! I don't think those small differences in WER are significant, likely just noise.
A better model is obtained with the following modifications:
  | feat-norm | learning factor | warm-up steps | epochs |
---|---|---|---|---|
before | no | 10 | 40,000 | 40 epochs (avg=10, over epochs 26-35) |
current | yes | 5 | 80,000 (around 10 epochs) | 50 epochs (avg=20, over epochs 31-50) |
Detailed WER on test-clean:
  | before | current |
---|---|---|
Encoder + ctc | 3.32 | 2.98 (WER of the ESPnet released model is 2.97/3.00) |
Encoder + TLG + 4-gram lattice rescore + n-best rescore with transformer decoder, with log_semiring=False and repeated tokens removed | 2.73 | 2.54 |
Results with different combinations of decoder_scale and lm_scale; WER=2.54 is obtained with decoder_scale=1.7 and lm_scale=1.7.
lm_scale \ decoder_scale | 0.1 | 0.3 | 0.5 | 0.6 | 0.7 | 0.9 | 1.0 | 1.1 | 1.2 | 1.3 | 1.5 | 1.7 | 1.9 | 2.0 | 2.1 | 2.2 | 2.3 | 2.4 | 2.5 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0.1 | 2.81 | 2.78 | 2.75 | 2.75 | 2.74 | 2.74 | 2.73 | 2.73 | 2.73 | 2.72 | 2.72 | 2.71 | 2.71 | 2.7 | 2.69 | 2.7 | 2.7 | 2.7 | 2.7 |
0.3 | 2.75 | 2.72 | 2.7 | 2.69 | 2.68 | 2.68 | 2.69 | 2.69 | 2.69 | 2.68 | 2.68 | 2.67 | 2.67 | 2.66 | 2.66 | 2.67 | 2.66 | 2.66 | 2.66 |
0.5 | 2.7 | 2.66 | 2.67 | 2.66 | 2.67 | 2.66 | 2.65 | 2.66 | 2.66 | 2.65 | 2.64 | 2.64 | 2.64 | 2.63 | 2.63 | 2.63 | 2.63 | 2.63 | 2.63 |
0.6 | 2.68 | 2.66 | 2.64 | 2.64 | 2.63 | 2.65 | 2.65 | 2.65 | 2.65 | 2.64 | 2.63 | 2.63 | 2.63 | 2.62 | 2.61 | 2.61 | 2.61 | 2.62 | 2.62 |
0.7 | 2.67 | 2.63 | 2.62 | 2.63 | 2.62 | 2.63 | 2.64 | 2.63 | 2.64 | 2.64 | 2.64 | 2.62 | 2.61 | 2.61 | 2.61 | 2.61 | 2.61 | 2.62 | 2.61 |
0.9 | 2.73 | 2.61 | 2.6 | 2.6 | 2.61 | 2.61 | 2.62 | 2.61 | 2.61 | 2.61 | 2.61 | 2.6 | 2.61 | 2.61 | 2.62 | 2.61 | 2.61 | 2.61 | 2.61 |
1.0 | 2.85 | 2.65 | 2.59 | 2.59 | 2.6 | 2.6 | 2.59 | 2.6 | 2.59 | 2.59 | 2.59 | 2.61 | 2.6 | 2.61 | 2.61 | 2.61 | 2.61 | 2.61 | 2.61 |
1.1 | 3.04 | 2.71 | 2.62 | 2.59 | 2.59 | 2.6 | 2.6 | 2.6 | 2.58 | 2.59 | 2.59 | 2.59 | 2.59 | 2.59 | 2.6 | 2.6 | 2.6 | 2.61 | 2.6 |
1.2 | 3.31 | 2.86 | 2.65 | 2.62 | 2.59 | 2.58 | 2.57 | 2.58 | 2.58 | 2.58 | 2.58 | 2.59 | 2.59 | 2.59 | 2.58 | 2.58 | 2.59 | 2.6 | 2.6 |
1.3 | 3.52 | 3.04 | 2.75 | 2.66 | 2.62 | 2.57 | 2.57 | 2.56 | 2.56 | 2.57 | 2.57 | 2.58 | 2.59 | 2.59 | 2.59 | 2.59 | 2.58 | 2.58 | 2.58 |
1.5 | 4.0 | 3.47 | 3.06 | 2.89 | 2.8 | 2.64 | 2.6 | 2.58 | 2.59 | 2.56 | 2.56 | 2.55 | 2.56 | 2.56 | 2.57 | 2.58 | 2.58 | 2.58 | 2.59 |
1.7 | 4.41 | 3.87 | 3.43 | 3.26 | 3.07 | 2.83 | 2.74 | 2.67 | 2.64 | 2.6 | 2.58 | 2.54 | 2.56 | 2.55 | 2.55 | 2.55 | 2.55 | 2.57 | 2.57 |
1.9 | 4.64 | 4.26 | 3.8 | 3.61 | 3.41 | 3.12 | 2.99 | 2.86 | 2.79 | 2.73 | 2.64 | 2.57 | 2.56 | 2.54 | 2.56 | 2.56 | 2.55 | 2.55 | 2.55 |
2.0 | 4.72 | 4.38 | 3.98 | 3.77 | 3.59 | 3.29 | 3.13 | 3.01 | 2.88 | 2.81 | 2.68 | 2.62 | 2.57 | 2.56 | 2.56 | 2.55 | 2.56 | 2.56 | 2.55 |
Great!!
Hi glynpu: This is very cool work; is there a recipe to reproduce your results? Thanks! @glynpu
This is a very cool work, is there a recipe to reproduce your results?
The current PR is mainly about the decoding part, and #219 is about the corresponding training part. Follow egs/librispeech/asr/simple_v1/bpe_run.sh in #219 and run stage 0 and stage 1 to reproduce my work. @Alex-Songs
thanks! @glynpu
Result without feature_batch_norm.
WER result on test_clean: ![WER on test_clean without feature_batch_norm](https://user-images.githubusercontent.com/14951566/124417143-148e6100-dd8b-11eb-8fbf-2a8d49eaca93.png)