k2-fsa / snowfall

Moved to https://github.com/k2-fsa/icefall
Apache License 2.0

[WIP] update bpe models and integrate 4-gram rescore #227

Open glynpu opened 3 years ago

glynpu commented 3 years ago
  1. A better model trained with (ctc + label_smooth_loss, #219) is released.
  2. 4-gram rescoring is integrated, with reference to #215.
Latest result with feat_batch_norm:

| | WER% on test_clean | WER% on test_other |
|---|---|---|
| Encoder + ctc | 2.98 | (to be tested) |
| Encoder + (ctc + 3-gram) + 4-gram lattice rescore + (transformer decoder n-best rescore), num-paths-for-decoder-rescore=500 | 2.54 | (to be tested) |

Result without feature_batch_norm:

| | WER% on test_clean | WER% on test_other |
|---|---|---|
| Encoder + ctc | 3.32 | 7.96 |
| Encoder + (ctc + 3-gram) + 4-gram lattice rescore | 2.92 | (failed when decoding, working on this) |
| Encoder + (ctc + 3-gram) + 4-gram lattice rescore + (transformer decoder n-best rescore), num-paths-for-decoder-rescore=100 | 2.87 | (to be tested) |
| Encoder + (ctc + 3-gram) + 4-gram lattice rescore + (transformer decoder n-best rescore), num-paths-for-decoder-rescore=500 | 2.86 | (to be tested) |
| + log_semiring=False and remove repeated tokens | 2.73 | 6.11 |

Wer result on test_clean: 38af9a2a0505f616a1fb9eaa7817c1a

glynpu commented 3 years ago

Here is the log from when the program crashed while decoding test-other:

INFO:root:batch 1910, cuts processed until now is 1943/2939 (66.110922%)
INFO:root:batch 1920, cuts processed until now is 1953/2939 (66.451174%)
INFO:root:batch 1930, cuts processed until now is 1963/2939 (66.791426%)
[F] /ceph-ly/open-source/latest_k2/k2/k2/python/csrc/torch/torch_util.h:122:k2::Array1<U> k2::FromTorch(at::Tensor&) [with T = int] Check failed: tensor.strides()[0] == 1 (4 vs. 1) Expected stride: 1. Given: 4

[ Stack-Trace: ]
/ceph-ly/open-source/latest_k2/k2/build/lib/libk2_log.so(k2::internal::GetStackTrace()+0x5b) [0x7fd36a0f66ba]
/ceph-ly/open-source/latest_k2/k2/build/lib/_k2.cpython-38-x86_64-linux-gnu.so(+0x70a52) [0x7fd36b423a52]
/ceph-ly/open-source/latest_k2/k2/build/lib/_k2.cpython-38-x86_64-linux-gnu.so(+0xb8c8f) [0x7fd36b46bc8f]
/ceph-ly/open-source/latest_k2/k2/build/lib/_k2.cpython-38-x86_64-linux-gnu.so(+0x1088bd) [0x7fd36b4bb8bd]
/ceph-ly/open-source/latest_k2/k2/build/lib/_k2.cpython-38-x86_64-linux-gnu.so(+0x102af4) [0x7fd36b4b5af4]
/ceph-ly/open-source/latest_k2/k2/build/lib/_k2.cpython-38-x86_64-linux-gnu.so(+0x11e695) [0x7fd36b4d1695]
/ceph-ly/open-source/latest_k2/k2/build/lib/_k2.cpython-38-x86_64-linux-gnu.so(+0x11db07) [0x7fd36b4d0b07]
/ceph-ly/open-source/latest_k2/k2/build/lib/_k2.cpython-38-x86_64-linux-gnu.so(+0x116d22) [0x7fd36b4c9d22]
/ceph-ly/open-source/latest_k2/k2/build/lib/_k2.cpython-38-x86_64-linux-gnu.so(+0x116f14) [0x7fd36b4c9f14]
/ceph-ly/open-source/latest_k2/k2/build/lib/_k2.cpython-38-x86_64-linux-gnu.so(+0x54187) [0x7fd36b407187]
python(PyCFunction_Call+0x56) [0x5ff8a6]
python(_PyObject_MakeTpCall+0x28f) [0x5fff6f]
python(_PyEval_EvalFrameDefault+0x5b9e) [0x57e35e]
python(_PyFunction_Vectorcall+0x19c) [0x602b2c]
python(PyVectorcall_Call+0x51) [0x5ff3b1]
/ceph-ly/py38/lib/python3.8/site-packages/torch/lib/libtorch_python.so(THPFunction_apply(_object*, _object*)+0x8fd) [0x7fd45ebdb78d]
python(PyCFunction_Call+0xfb) [0x5ff94b]
python(_PyObject_MakeTpCall+0x28f) [0x5fff6f]
python(_PyEval_EvalFrameDefault+0x5b9e) [0x57e35e]
python(_PyEval_EvalCodeWithName+0x25c) [0x5765ec]
python(_PyFunction_Vectorcall+0x442) [0x602dd2]
python(_PyEval_EvalFrameDefault+0x1930) [0x57a0f0]
python(_PyFunction_Vectorcall+0x19c) [0x602b2c]
python(_PyEval_EvalFrameDefault+0x53f0) [0x57dbb0]
python(_PyEval_EvalCodeWithName+0x25c) [0x5765ec]
python(_PyFunction_Vectorcall+0x442) [0x602dd2]
python(_PyEval_EvalFrameDefault+0x1930) [0x57a0f0]
python(_PyFunction_Vectorcall+0x19c) [0x602b2c]
python(PyVectorcall_Call+0x51) [0x5ff3b1]
python(_PyEval_EvalFrameDefault+0x1c4a) [0x57a40a]
python(_PyEval_EvalCodeWithName+0x25c) [0x5765ec]
python(_PyFunction_Vectorcall+0x247) [0x602bd7]
python(_PyEval_EvalFrameDefault+0x619) [0x578dd9]
python(_PyEval_EvalCodeWithName+0x25c) [0x5765ec]
python(_PyFunction_Vectorcall+0x442) [0x602dd2]
python(_PyEval_EvalFrameDefault+0x1930) [0x57a0f0]
python(_PyEval_EvalCodeWithName+0x25c) [0x5765ec]
python(_PyFunction_Vectorcall+0x442) [0x602dd2]
python(PyVectorcall_Call+0x51) [0x5ff3b1]
python(_PyEval_EvalFrameDefault+0x1c4a) [0x57a40a]
python(_PyEval_EvalCodeWithName+0x25c) [0x5765ec]
python(_PyFunction_Vectorcall+0x442) [0x602dd2]
python(_PyEval_EvalFrameDefault+0x1930) [0x57a0f0]
python(_PyEval_EvalCodeWithName+0x25c) [0x5765ec]
python(_PyFunction_Vectorcall+0x247) [0x602bd7]
python(_PyEval_EvalFrameDefault+0x619) [0x578dd9]
python(_PyEval_EvalCodeWithName+0x25c) [0x5765ec]
python() [0x662c2e]
python(PyRun_FileExFlags+0x97) [0x662d07]
python(PyRun_SimpleFileExFlags+0x17f) [0x663a1f]

Traceback (most recent call last):
  File "bpe_ctc_att_conformer_decode.py", line 617, in <module>
  File "bpe_ctc_att_conformer_decode.py", line 576, in main
    model=model,
  File "/ceph-ly/py38/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "bpe_ctc_att_conformer_decode.py", line 278, in decode
    model=model,
  File "bpe_ctc_att_conformer_decode.py", line 240, in decode_one_batch
    lm_scale_list)
  File "/ceph-ly/py38/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/ceph-ly/open-source/to_submit/lattice_rescore_snwofall/snowfall/snowfall/decoding/lm_rescore.py", line 320, in rescore_with_whole_lattice
    best_paths = k2.shortest_path(inv_lats, use_double_scores=True)
  File "/ceph-ly/open-source/latest_k2/k2/k2/python/k2/fsa_algo.py", line 541, in shortest_path
    out_fsa = k2.utils.fsa_from_unary_function_tensor(fsa, ragged_arc, arc_map)
  File "/ceph-ly/open-source/latest_k2/k2/k2/python/k2/utils.py", line 449, in fsa_from_unary_function_tensor
    setattr(dest, name, index_select(value, arc_map,
  File "/ceph-ly/open-source/latest_k2/k2/k2/python/k2/ops.py", line 159, in index_select
    ans = _IndexSelectFunction.apply(src, index, default_value)
  File "/ceph-ly/open-source/latest_k2/k2/k2/python/k2/ops.py", line 65, in forward
    return _k2.index_select(src, index, default_value)
RuntimeError: Some bad things happed.
csukuangfj commented 3 years ago

Here is the log when program crash while decoding test-other:



Will have a look. Probably tomorrow.

danpovey commented 3 years ago

Can you find the code where it gets 'index' from? Possibly we failed to do clone() at some point to make it a stride-1 tensor if it came from an FSA (but it's still very odd). You may be able to replicate the failure in pdb and debug it that way (let me know by WeChat if running it in pdb shows an error, because I may be able to remember the fix).
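To make the "stride-1" point concrete, here is a toy, pure-Python sketch of how a stride of 4 could arise. It assumes (this is not confirmed in the thread) that the tensor is a view of one field of row-major arc storage with four values per arc. `StridedView` and `column` are hypothetical helpers written for this illustration; they are not k2 or PyTorch APIs.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StridedView:
    """Toy 1-D view over a flat buffer (illustration only)."""
    data: List[int]   # shared flat storage
    offset: int       # index of the first element inside `data`
    stride: int       # step between consecutive elements, in elements
    length: int       # number of elements visible through the view

    def to_list(self) -> List[int]:
        return [self.data[self.offset + i * self.stride]
                for i in range(self.length)]

    def clone(self) -> "StridedView":
        """Copy the elements into fresh, densely packed storage (stride 1)."""
        return StridedView(self.to_list(), offset=0, stride=1,
                           length=self.length)


def column(flat: List[int], num_cols: int, col: int) -> StridedView:
    """View one column of a row-major matrix without copying."""
    return StridedView(flat, offset=col, stride=num_cols,
                       length=len(flat) // num_cols)


# Three rows of four fields each (assumed layout: src, dest, label, score).
arcs = [0, 1, 10, 5,
        1, 2, 20, 6,
        2, 3, 30, 7]

labels = column(arcs, num_cols=4, col=2)   # shares storage, stride 4
packed = labels.clone()                    # own storage, stride 1
print(labels.stride, packed.stride, packed.to_list())
```

The view and its clone hold the same values, but only the clone satisfies a check like `strides()[0] == 1`.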

danpovey commented 3 years ago

The line numbers in utils.py don't seem to match with the current master.

glynpu commented 3 years ago

Result of n-best rescore with transformer decoder:

| | WER% on test_clean | WER% on test_other |
|---|---|---|
| Encoder + ctc | 3.32 | 7.96 |
| Encoder + (ctc + 3-gram) + 4-gram lattice rescore | 2.92 | (failed when decoding, working on this) |
| Encoder + (ctc + 3-gram) + 4-gram lattice rescore + (transformer decoder n-best rescore), num-paths-for-decoder-rescore=100 | 2.87 | (to be tested) |
| Encoder + (ctc + 3-gram) + 4-gram lattice rescore + (transformer decoder n-best rescore), num-paths-for-decoder-rescore=500 | 2.86 | (to be tested) |

Detailed errors

num-paths-for-decoder-rescore=100
INFO:root:[test-clean-lm_scale_0.6] %WER 2.87% [1510 / 52576, 207 ins, 130 del, 1173 sub ]
num-paths-for-decoder-rescore=500
INFO:root:[test-clean-lm_scale_0.6] %WER 2.86% [1505 / 52576, 207 ins, 128 del, 1170 sub ]
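
The percentages in these two log lines follow directly from the error counts: WER is (insertions + deletions + substitutions) divided by the number of reference words. A minimal sketch that reproduces both figures; the function name `wer_percent` is illustrative and not taken from the decoding script.

```python
def wer_percent(ins: int, dels: int, subs: int, ref_words: int) -> float:
    """Word error rate in percent from edit-operation counts."""
    return 100.0 * (ins + dels + subs) / ref_words


# Counts copied from the two log lines above.
wer_100 = wer_percent(ins=207, dels=130, subs=1173, ref_words=52576)
wer_500 = wer_percent(ins=207, dels=128, subs=1170, ref_words=52576)
print(f"{wer_100:.2f} {wer_500:.2f}")  # prints "2.87 2.86"
```

Going from 100 to 500 paths removes five errors out of 52576 reference words, so the difference is small.
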
danpovey commented 3 years ago

What is the LM scale? I would imagine that when using the transformer decoder, we'd need to scale down the LM probabilities, because that decoder would already account for the LM prob.

glynpu commented 3 years ago

What is the LM scale?

Currently there is no scale:

    tot_scores = am_scores + fgram_lm_scores + decoder_scores

we'd need to scale down the LM probabilities, because that decoder would already account for the LM prob.

Do you mean assigning a weight less than one to lm_scores? Like this:

-   tot_scores = am_scores + fgram_lm_scores + decoder_scores
+   lm_score_weight = 0.6 # just a value less than one
+   decoder_score_weight = 0.7 # just a value less than one
+   tot_scores = am_scores + lm_score_weight * fgram_lm_scores + decoder_score_weight * decoder_scores
csukuangfj commented 3 years ago

Do you mean assign a weight less than one to lm_scores? like this:

-   tot_scores = am_scores + fgram_lm_scores + decoder_scores
+   lm_score_weight = 0.6 # just a value less than one
+   decoder_score_weight = 0.7 # just a value less than one
+   tot_scores = am_scores + lm_score_weight * fgram_lm_scores + decoder_score_weight * decoder_scores

I often see people using a combination of weights that sum to 1.
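
As a rough sketch of the weighting discussed here: the n-best winner is the hypothesis with the highest weighted sum of scores, and dividing all weights by their sum (so that they add up to 1) does not change which hypothesis wins. The scores below are made-up log-probabilities and the helper names are hypothetical; this is not the code in lm_rescore.py.

```python
from typing import List, Tuple

# (text, am_score, fgram_lm_score, decoder_score); values are invented.
Hyp = Tuple[str, float, float, float]


def best_hypothesis(hyps: List[Hyp], am_w: float, lm_w: float,
                    dec_w: float) -> str:
    """Return the text of the hypothesis with the highest weighted score."""
    def total(h: Hyp) -> float:
        return am_w * h[1] + lm_w * h[2] + dec_w * h[3]
    return max(hyps, key=total)[0]


def normalize(am_w: float, lm_w: float, dec_w: float) -> Tuple[float, float, float]:
    """Rescale positive weights so that they sum to 1."""
    s = am_w + lm_w + dec_w
    return am_w / s, lm_w / s, dec_w / s


hyps = [("a", -10.0, -4.0, -3.0),
        ("b", -9.5, -6.0, -3.5),
        ("c", -11.0, -3.0, -2.5)]

unweighted = best_hypothesis(hyps, 1.0, 1.0, 1.0)        # no scaling
weighted = best_hypothesis(hyps, 1.0, 0.6, 0.7)          # weights from the diff
normalized = best_hypothesis(hyps, *normalize(1.0, 0.6, 0.7))
print(unweighted, weighted, normalized)
```

With these invented numbers the unscaled sum picks "c", while both the scaled and the normalized weights pick "a". Only the ratios between the weights matter for the ranking.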

glynpu commented 3 years ago

Computing the am/4-gram lm_scores with unique_token_seqs seems slightly better than with unique_word_seqs, across a variety of lm_scale and decoder_scale combinations.

| | WER% on test_clean | WER% on test_other |
|---|---|---|
| Encoder + ctc | 3.32 | 7.96 |
| Encoder + (ctc + 3-gram) + 4-gram lattice rescore | 2.92 | (failed when decoding, working on this) |
| + transformer decoder n-best rescore, computing with unique_word_seqs | 2.87 | (to be tested) |
| + transformer decoder n-best rescore, computing with unique_token_seqs | 2.81 | (to be tested) |

WER (%) on test_clean with compute_am_flm_scores_1, computing with unique_word_seqs. Rows are lm_scale; columns are decoder_scale.

| lm_scale \ decoder_scale | 0.01 | 0.03 | 0.05 | 0.08 | 0.09 | 0.1 | 0.3 | 0.5 | 0.7 | 0.9 | 1.0 | 2.0 | 4.0 | 6.0 | 8.0 | 10.0 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.1 | 3.04 | 3.02 | 3.02 | 2.99 | 3.0 | 2.99 | 3.07 | 3.17 | 3.25 | 3.33 | 3.33 | 3.52 | 3.6 | 3.66 | 3.69 | 3.71 |
| 0.3 | 2.96 | 2.94 | 2.94 | 2.94 | 2.93 | 2.94 | 3.05 | 3.17 | 3.24 | 3.31 | 3.34 | 3.52 | 3.61 | 3.66 | 3.69 | 3.72 |
| 0.5 | 2.93 | 2.91 | 2.89 | 2.88 | 2.87 | 2.89 | 3.04 | 3.14 | 3.27 | 3.33 | 3.36 | 3.52 | 3.62 | 3.67 | 3.7 | 3.71 |
| 0.6 | 2.91 | 2.89 | 2.88 | 2.89 | 2.88 | 2.89 | 3.04 | 3.16 | 3.26 | 3.34 | 3.37 | 3.53 | 3.62 | 3.67 | 3.7 | 3.72 |
| 0.7 | 2.93 | 2.93 | 2.91 | 2.91 | 2.9 | 2.9 | 3.06 | 3.16 | 3.28 | 3.33 | 3.36 | 3.53 | 3.62 | 3.67 | 3.71 | 3.73 |
| 0.9 | 3.14 | 3.1 | 3.09 | 3.06 | 3.05 | 3.05 | 3.13 | 3.24 | 3.31 | 3.37 | 3.4 | 3.55 | 3.65 | 3.69 | 3.72 | 3.74 |
| 1.0 | 3.33 | 3.28 | 3.25 | 3.2 | 3.21 | 3.21 | 3.21 | 3.29 | 3.37 | 3.4 | 3.43 | 3.59 | 3.67 | 3.7 | 3.74 | 3.74 |
| 2.0 | 5.63 | 5.56 | 5.53 | 5.47 | 5.45 | 5.43 | 5.06 | 4.68 | 4.39 | 4.18 | 4.11 | 3.82 | 3.8 | 3.8 | 3.81 | 3.8 |
| 4.0 | 6.13 | 6.11 | 6.1 | 6.1 | 6.09 | 6.08 | 5.97 | 5.84 | 5.69 | 5.56 | 5.49 | 4.75 | 4.06 | 3.92 | 3.87 | 3.86 |
| 6.0 | 6.23 | 6.22 | 6.22 | 6.2 | 6.21 | 6.21 | 6.15 | 6.08 | 5.99 | 5.91 | 5.89 | 5.44 | 4.65 | 4.14 | 3.96 | 3.92 |
| 8.0 | 6.3 | 6.3 | 6.28 | 6.28 | 6.28 | 6.27 | 6.23 | 6.19 | 6.13 | 6.09 | 6.04 | 5.79 | 5.08 | 4.61 | 4.22 | 4.02 |
| 10.0 | 6.32 | 6.32 | 6.31 | 6.31 | 6.31 | 6.31 | 6.27 | 6.24 | 6.2 | 6.16 | 6.14 | 5.93 | 5.42 | 4.9 | 4.58 | 4.27 |

WER (%) on test_clean with compute_am_flm_scores_2, computing with unique_token_seqs. Rows are lm_scale; columns are decoder_scale.

| lm_scale \ decoder_scale | 0.01 | 0.03 | 0.05 | 0.08 | 0.09 | 0.1 | 0.3 | 0.5 | 0.7 | 0.9 | 1.0 | 2.0 | 4.0 | 6.0 | 8.0 | 10.0 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.1 | 3.02 | 3.0 | 2.98 | 2.95 | 2.94 | 2.94 | 2.9 | 2.87 | 2.88 | 2.89 | 2.88 | 2.89 | 2.91 | 2.92 | 2.93 | 2.94 |
| 0.3 | 2.97 | 2.95 | 2.93 | 2.91 | 2.9 | 2.9 | 2.85 | 2.86 | 2.86 | 2.85 | 2.86 | 2.89 | 2.9 | 2.93 | 2.93 | 2.94 |
| 0.5 | 2.92 | 2.92 | 2.91 | 2.88 | 2.88 | 2.88 | 2.85 | 2.82 | 2.83 | 2.85 | 2.85 | 2.88 | 2.91 | 2.93 | 2.94 | 2.94 |
| 0.6 | 2.92 | 2.89 | 2.9 | 2.88 | 2.86 | 2.86 | 2.84 | 2.83 | 2.83 | 2.84 | 2.85 | 2.88 | 2.92 | 2.93 | 2.94 | 2.94 |
| 0.7 | 2.94 | 2.93 | 2.93 | 2.9 | 2.9 | 2.89 | 2.82 | 2.82 | 2.83 | 2.84 | 2.84 | 2.89 | 2.92 | 2.93 | 2.94 | 2.94 |
| 0.9 | 3.14 | 3.11 | 3.07 | 3.01 | 3.0 | 2.99 | 2.88 | 2.82 | 2.81 | 2.82 | 2.84 | 2.89 | 2.93 | 2.94 | 2.94 | 2.94 |
| 1.0 | 3.3 | 3.25 | 3.19 | 3.14 | 3.12 | 3.11 | 2.91 | 2.85 | 2.83 | 2.82 | 2.83 | 2.89 | 2.93 | 2.94 | 2.94 | 2.95 |
| 2.0 | 5.53 | 5.48 | 5.45 | 5.38 | 5.35 | 5.33 | 4.7 | 4.11 | 3.72 | 3.5 | 3.39 | 2.97 | 2.93 | 2.94 | 2.94 | 2.95 |
| 4.0 | 6.09 | 6.08 | 6.06 | 6.05 | 6.05 | 6.04 | 5.86 | 5.6 | 5.29 | 4.98 | 4.85 | 3.95 | 3.14 | 2.98 | 2.94 | 2.95 |
| 6.0 | 6.19 | 6.19 | 6.19 | 6.17 | 6.18 | 6.17 | 6.08 | 5.94 | 5.79 | 5.61 | 5.5 | 4.67 | 3.76 | 3.25 | 3.02 | 2.98 |
| 8.0 | 6.25 | 6.25 | 6.25 | 6.25 | 6.24 | 6.23 | 6.16 | 6.08 | 5.97 | 5.87 | 5.81 | 5.18 | 4.18 | 3.67 | 3.3 | 3.09 |
| 10.0 | 6.28 | 6.28 | 6.27 | 6.27 | 6.27 | 6.26 | 6.21 | 6.15 | 6.08 | 5.99 | 5.94 | 5.48 | 4.54 | 3.99 | 3.63 | 3.36 |

log of compute_am_flm_scores_1:

lm_scale_0.5_decoder_scale_0.09 2.87    best for test-clean
lm_scale_0.5_decoder_scale_0.08 2.88
lm_scale_0.6_decoder_scale_0.05 2.88
lm_scale_0.6_decoder_scale_0.09 2.88
lm_scale_0.5_decoder_scale_0.1  2.89
lm_scale_0.5_decoder_scale_0.05 2.89
lm_scale_0.6_decoder_scale_0.1  2.89
lm_scale_0.6_decoder_scale_0.03 2.89
lm_scale_0.6_decoder_scale_0.08 2.89
lm_scale_0.7_decoder_scale_0.1  2.9
lm_scale_0.7_decoder_scale_0.09 2.9
lm_scale_0.5_decoder_scale_0.03 2.91
lm_scale_0.6_decoder_scale_0.01 2.91
lm_scale_0.7_decoder_scale_0.05 2.91
lm_scale_0.7_decoder_scale_0.08 2.91
lm_scale_0.3_decoder_scale_0.09 2.93
lm_scale_0.5_decoder_scale_0.01 2.93
lm_scale_0.7_decoder_scale_0.01 2.93
lm_scale_0.7_decoder_scale_0.03 2.93
lm_scale_0.3_decoder_scale_0.1  2.94
lm_scale_0.3_decoder_scale_0.03 2.94
lm_scale_0.3_decoder_scale_0.05 2.94
lm_scale_0.3_decoder_scale_0.08 2.94
lm_scale_0.3_decoder_scale_0.01 2.96
lm_scale_0.1_decoder_scale_0.1  2.99
lm_scale_0.1_decoder_scale_0.08 2.99
lm_scale_0.1_decoder_scale_0.09 3.0
lm_scale_0.1_decoder_scale_0.03 3.02
lm_scale_0.1_decoder_scale_0.05 3.02
lm_scale_0.1_decoder_scale_0.01 3.04
lm_scale_0.5_decoder_scale_0.3  3.04
lm_scale_0.6_decoder_scale_0.3  3.04
lm_scale_0.3_decoder_scale_0.3  3.05
lm_scale_0.9_decoder_scale_0.1  3.05
lm_scale_0.9_decoder_scale_0.09 3.05
lm_scale_0.7_decoder_scale_0.3  3.06
lm_scale_0.9_decoder_scale_0.08 3.06
lm_scale_0.1_decoder_scale_0.3  3.07
lm_scale_0.9_decoder_scale_0.05 3.09
lm_scale_0.9_decoder_scale_0.03 3.1
lm_scale_0.9_decoder_scale_0.3  3.13
lm_scale_0.5_decoder_scale_0.5  3.14
lm_scale_0.9_decoder_scale_0.01 3.14
lm_scale_0.6_decoder_scale_0.5  3.16
lm_scale_0.7_decoder_scale_0.5  3.16
lm_scale_0.1_decoder_scale_0.5  3.17
lm_scale_0.3_decoder_scale_0.5  3.17
lm_scale_1.0_decoder_scale_0.08 3.2
lm_scale_1.0_decoder_scale_0.1  3.21
lm_scale_1.0_decoder_scale_0.3  3.21
lm_scale_1.0_decoder_scale_0.09 3.21
lm_scale_0.3_decoder_scale_0.7  3.24
lm_scale_0.9_decoder_scale_0.5  3.24
lm_scale_0.1_decoder_scale_0.7  3.25
lm_scale_1.0_decoder_scale_0.05 3.25
lm_scale_0.6_decoder_scale_0.7  3.26
lm_scale_0.5_decoder_scale_0.7  3.27
lm_scale_0.7_decoder_scale_0.7  3.28
lm_scale_1.0_decoder_scale_0.03 3.28
lm_scale_1.0_decoder_scale_0.5  3.29
lm_scale_0.3_decoder_scale_0.9  3.31
lm_scale_0.9_decoder_scale_0.7  3.31
lm_scale_0.1_decoder_scale_0.9  3.33
lm_scale_0.1_decoder_scale_1.0  3.33
lm_scale_0.5_decoder_scale_0.9  3.33
lm_scale_0.7_decoder_scale_0.9  3.33
lm_scale_1.0_decoder_scale_0.01 3.33
lm_scale_0.3_decoder_scale_1.0  3.34
lm_scale_0.6_decoder_scale_0.9  3.34
lm_scale_0.5_decoder_scale_1.0  3.36
lm_scale_0.7_decoder_scale_1.0  3.36
lm_scale_0.6_decoder_scale_1.0  3.37
lm_scale_0.9_decoder_scale_0.9  3.37
lm_scale_1.0_decoder_scale_0.7  3.37
lm_scale_0.9_decoder_scale_1.0  3.4
lm_scale_1.0_decoder_scale_0.9  3.4
lm_scale_1.0_decoder_scale_1.0  3.43
lm_scale_0.1_decoder_scale_2.0  3.52
lm_scale_0.3_decoder_scale_2.0  3.52
lm_scale_0.5_decoder_scale_2.0  3.52
lm_scale_0.6_decoder_scale_2.0  3.53
lm_scale_0.7_decoder_scale_2.0  3.53
lm_scale_0.9_decoder_scale_2.0  3.55
lm_scale_1.0_decoder_scale_2.0  3.59
lm_scale_0.1_decoder_scale_4.0  3.6
lm_scale_0.3_decoder_scale_4.0  3.61
lm_scale_0.5_decoder_scale_4.0  3.62
lm_scale_0.6_decoder_scale_4.0  3.62
lm_scale_0.7_decoder_scale_4.0  3.62
lm_scale_0.9_decoder_scale_4.0  3.65
lm_scale_0.1_decoder_scale_6.0  3.66
lm_scale_0.3_decoder_scale_6.0  3.66
lm_scale_0.5_decoder_scale_6.0  3.67
lm_scale_0.6_decoder_scale_6.0  3.67
lm_scale_0.7_decoder_scale_6.0  3.67
lm_scale_1.0_decoder_scale_4.0  3.67
lm_scale_0.1_decoder_scale_8.0  3.69
lm_scale_0.3_decoder_scale_8.0  3.69
lm_scale_0.9_decoder_scale_6.0  3.69
lm_scale_0.5_decoder_scale_8.0  3.7
lm_scale_0.6_decoder_scale_8.0  3.7
lm_scale_1.0_decoder_scale_6.0  3.7
lm_scale_0.1_decoder_scale_10.0 3.71
lm_scale_0.5_decoder_scale_10.0 3.71
lm_scale_0.7_decoder_scale_8.0  3.71
lm_scale_0.3_decoder_scale_10.0 3.72
lm_scale_0.6_decoder_scale_10.0 3.72
lm_scale_0.9_decoder_scale_8.0  3.72
lm_scale_0.7_decoder_scale_10.0 3.73
lm_scale_0.9_decoder_scale_10.0 3.74
lm_scale_1.0_decoder_scale_8.0  3.74
lm_scale_1.0_decoder_scale_10.0 3.74
lm_scale_2.0_decoder_scale_4.0  3.8
lm_scale_2.0_decoder_scale_6.0  3.8
lm_scale_2.0_decoder_scale_10.0 3.8
lm_scale_2.0_decoder_scale_8.0  3.81
lm_scale_2.0_decoder_scale_2.0  3.82
lm_scale_4.0_decoder_scale_10.0 3.86
lm_scale_4.0_decoder_scale_8.0  3.87
lm_scale_4.0_decoder_scale_6.0  3.92
lm_scale_6.0_decoder_scale_10.0 3.92
lm_scale_6.0_decoder_scale_8.0  3.96
lm_scale_8.0_decoder_scale_10.0 4.02
lm_scale_4.0_decoder_scale_4.0  4.06
lm_scale_2.0_decoder_scale_1.0  4.11
lm_scale_6.0_decoder_scale_6.0  4.14
lm_scale_2.0_decoder_scale_0.9  4.18
lm_scale_8.0_decoder_scale_8.0  4.22
lm_scale_10.0_decoder_scale_10.0    4.27
lm_scale_2.0_decoder_scale_0.7  4.39
lm_scale_10.0_decoder_scale_8.0 4.58
lm_scale_8.0_decoder_scale_6.0  4.61
lm_scale_6.0_decoder_scale_4.0  4.65
lm_scale_2.0_decoder_scale_0.5  4.68
lm_scale_4.0_decoder_scale_2.0  4.75
lm_scale_10.0_decoder_scale_6.0 4.9
lm_scale_2.0_decoder_scale_0.3  5.06
lm_scale_8.0_decoder_scale_4.0  5.08
lm_scale_10.0_decoder_scale_4.0 5.42
lm_scale_2.0_decoder_scale_0.1  5.43
lm_scale_6.0_decoder_scale_2.0  5.44
lm_scale_2.0_decoder_scale_0.09 5.45
lm_scale_2.0_decoder_scale_0.08 5.47
lm_scale_4.0_decoder_scale_1.0  5.49
lm_scale_2.0_decoder_scale_0.05 5.53
lm_scale_2.0_decoder_scale_0.03 5.56
lm_scale_4.0_decoder_scale_0.9  5.56
lm_scale_2.0_decoder_scale_0.01 5.63
lm_scale_4.0_decoder_scale_0.7  5.69
lm_scale_8.0_decoder_scale_2.0  5.79
lm_scale_4.0_decoder_scale_0.5  5.84
lm_scale_6.0_decoder_scale_1.0  5.89
lm_scale_6.0_decoder_scale_0.9  5.91
lm_scale_10.0_decoder_scale_2.0 5.93
lm_scale_4.0_decoder_scale_0.3  5.97
lm_scale_6.0_decoder_scale_0.7  5.99
lm_scale_8.0_decoder_scale_1.0  6.04
lm_scale_4.0_decoder_scale_0.1  6.08
lm_scale_6.0_decoder_scale_0.5  6.08
lm_scale_4.0_decoder_scale_0.09 6.09
lm_scale_8.0_decoder_scale_0.9  6.09
lm_scale_4.0_decoder_scale_0.05 6.1
lm_scale_4.0_decoder_scale_0.08 6.1
lm_scale_4.0_decoder_scale_0.03 6.11
lm_scale_4.0_decoder_scale_0.01 6.13
lm_scale_8.0_decoder_scale_0.7  6.13
lm_scale_10.0_decoder_scale_1.0 6.14
lm_scale_6.0_decoder_scale_0.3  6.15
lm_scale_10.0_decoder_scale_0.9 6.16
lm_scale_8.0_decoder_scale_0.5  6.19
lm_scale_6.0_decoder_scale_0.08 6.2
lm_scale_10.0_decoder_scale_0.7 6.2
lm_scale_6.0_decoder_scale_0.1  6.21
lm_scale_6.0_decoder_scale_0.09 6.21
lm_scale_6.0_decoder_scale_0.03 6.22
lm_scale_6.0_decoder_scale_0.05 6.22
lm_scale_6.0_decoder_scale_0.01 6.23
lm_scale_8.0_decoder_scale_0.3  6.23
lm_scale_10.0_decoder_scale_0.5 6.24
lm_scale_8.0_decoder_scale_0.1  6.27
lm_scale_10.0_decoder_scale_0.3 6.27
lm_scale_8.0_decoder_scale_0.05 6.28
lm_scale_8.0_decoder_scale_0.08 6.28
lm_scale_8.0_decoder_scale_0.09 6.28
lm_scale_8.0_decoder_scale_0.01 6.3
lm_scale_8.0_decoder_scale_0.03 6.3
lm_scale_10.0_decoder_scale_0.1 6.31
lm_scale_10.0_decoder_scale_0.05    6.31
lm_scale_10.0_decoder_scale_0.08    6.31
lm_scale_10.0_decoder_scale_0.09    6.31
lm_scale_10.0_decoder_scale_0.01    6.32
lm_scale_10.0_decoder_scale_0.03    6.32

log of compute_am_flm_scores_2

lm_scale_0.9_decoder_scale_0.7  2.81    best for test-clean
lm_scale_0.5_decoder_scale_0.5  2.82
lm_scale_0.7_decoder_scale_0.3  2.82
lm_scale_0.7_decoder_scale_0.5  2.82
lm_scale_0.9_decoder_scale_0.5  2.82
lm_scale_0.9_decoder_scale_0.9  2.82
lm_scale_1.0_decoder_scale_0.9  2.82
lm_scale_0.5_decoder_scale_0.7  2.83
lm_scale_0.6_decoder_scale_0.5  2.83
lm_scale_0.6_decoder_scale_0.7  2.83
lm_scale_0.7_decoder_scale_0.7  2.83
lm_scale_1.0_decoder_scale_0.7  2.83
lm_scale_1.0_decoder_scale_1.0  2.83
lm_scale_0.6_decoder_scale_0.3  2.84
lm_scale_0.6_decoder_scale_0.9  2.84
lm_scale_0.7_decoder_scale_0.9  2.84
lm_scale_0.7_decoder_scale_1.0  2.84
lm_scale_0.9_decoder_scale_1.0  2.84
lm_scale_0.3_decoder_scale_0.3  2.85
lm_scale_0.3_decoder_scale_0.9  2.85
lm_scale_0.5_decoder_scale_0.3  2.85
lm_scale_0.5_decoder_scale_0.9  2.85
lm_scale_0.5_decoder_scale_1.0  2.85
lm_scale_0.6_decoder_scale_1.0  2.85
lm_scale_1.0_decoder_scale_0.5  2.85
lm_scale_0.3_decoder_scale_0.5  2.86
lm_scale_0.3_decoder_scale_0.7  2.86
lm_scale_0.3_decoder_scale_1.0  2.86
lm_scale_0.6_decoder_scale_0.1  2.86
lm_scale_0.6_decoder_scale_0.09 2.86
lm_scale_0.1_decoder_scale_0.5  2.87
lm_scale_0.1_decoder_scale_0.7  2.88
lm_scale_0.1_decoder_scale_1.0  2.88
lm_scale_0.5_decoder_scale_0.1  2.88
lm_scale_0.5_decoder_scale_2.0  2.88
lm_scale_0.5_decoder_scale_0.08 2.88
lm_scale_0.5_decoder_scale_0.09 2.88
lm_scale_0.6_decoder_scale_2.0  2.88
lm_scale_0.6_decoder_scale_0.08 2.88
lm_scale_0.9_decoder_scale_0.3  2.88
lm_scale_0.1_decoder_scale_0.9  2.89
lm_scale_0.1_decoder_scale_2.0  2.89
lm_scale_0.3_decoder_scale_2.0  2.89
lm_scale_0.6_decoder_scale_0.03 2.89
lm_scale_0.7_decoder_scale_0.1  2.89
lm_scale_0.7_decoder_scale_2.0  2.89
lm_scale_0.9_decoder_scale_2.0  2.89
lm_scale_1.0_decoder_scale_2.0  2.89
lm_scale_0.1_decoder_scale_0.3  2.9
lm_scale_0.3_decoder_scale_0.1  2.9
lm_scale_0.3_decoder_scale_4.0  2.9
lm_scale_0.3_decoder_scale_0.09 2.9
lm_scale_0.6_decoder_scale_0.05 2.9
lm_scale_0.7_decoder_scale_0.08 2.9
lm_scale_0.7_decoder_scale_0.09 2.9
lm_scale_0.1_decoder_scale_4.0  2.91
lm_scale_0.3_decoder_scale_0.08 2.91
lm_scale_0.5_decoder_scale_4.0  2.91
lm_scale_0.5_decoder_scale_0.05 2.91
lm_scale_1.0_decoder_scale_0.3  2.91
lm_scale_0.1_decoder_scale_6.0  2.92
lm_scale_0.5_decoder_scale_0.01 2.92
lm_scale_0.5_decoder_scale_0.03 2.92
lm_scale_0.6_decoder_scale_4.0  2.92
lm_scale_0.6_decoder_scale_0.01 2.92
lm_scale_0.7_decoder_scale_4.0  2.92
lm_scale_0.1_decoder_scale_8.0  2.93
lm_scale_0.3_decoder_scale_6.0  2.93
lm_scale_0.3_decoder_scale_8.0  2.93
lm_scale_0.3_decoder_scale_0.05 2.93
lm_scale_0.5_decoder_scale_6.0  2.93
lm_scale_0.6_decoder_scale_6.0  2.93
lm_scale_0.7_decoder_scale_6.0  2.93
lm_scale_0.7_decoder_scale_0.03 2.93
lm_scale_0.7_decoder_scale_0.05 2.93
lm_scale_0.9_decoder_scale_4.0  2.93
lm_scale_1.0_decoder_scale_4.0  2.93
lm_scale_2.0_decoder_scale_4.0  2.93
lm_scale_0.1_decoder_scale_0.1  2.94
lm_scale_0.1_decoder_scale_10.0 2.94
lm_scale_0.1_decoder_scale_0.09 2.94
lm_scale_0.3_decoder_scale_10.0 2.94
lm_scale_0.5_decoder_scale_8.0  2.94
lm_scale_0.5_decoder_scale_10.0 2.94
lm_scale_0.6_decoder_scale_8.0  2.94
lm_scale_0.6_decoder_scale_10.0 2.94
lm_scale_0.7_decoder_scale_8.0  2.94
lm_scale_0.7_decoder_scale_10.0 2.94
lm_scale_0.7_decoder_scale_0.01 2.94
lm_scale_0.9_decoder_scale_6.0  2.94
lm_scale_0.9_decoder_scale_8.0  2.94
lm_scale_0.9_decoder_scale_10.0 2.94
lm_scale_1.0_decoder_scale_6.0  2.94
lm_scale_1.0_decoder_scale_8.0  2.94
lm_scale_2.0_decoder_scale_6.0  2.94
lm_scale_2.0_decoder_scale_8.0  2.94
lm_scale_4.0_decoder_scale_8.0  2.94
lm_scale_0.1_decoder_scale_0.08 2.95
lm_scale_0.3_decoder_scale_0.03 2.95
lm_scale_1.0_decoder_scale_10.0 2.95
lm_scale_2.0_decoder_scale_10.0 2.95
lm_scale_4.0_decoder_scale_10.0 2.95
lm_scale_0.3_decoder_scale_0.01 2.97
lm_scale_2.0_decoder_scale_2.0  2.97
lm_scale_0.1_decoder_scale_0.05 2.98
lm_scale_4.0_decoder_scale_6.0  2.98
lm_scale_6.0_decoder_scale_10.0 2.98
lm_scale_0.9_decoder_scale_0.1  2.99
lm_scale_0.1_decoder_scale_0.03 3.0
lm_scale_0.9_decoder_scale_0.09 3.0
lm_scale_0.9_decoder_scale_0.08 3.01
lm_scale_0.1_decoder_scale_0.01 3.02
lm_scale_6.0_decoder_scale_8.0  3.02
lm_scale_0.9_decoder_scale_0.05 3.07
lm_scale_8.0_decoder_scale_10.0 3.09
lm_scale_0.9_decoder_scale_0.03 3.11
lm_scale_1.0_decoder_scale_0.1  3.11
lm_scale_1.0_decoder_scale_0.09 3.12
lm_scale_0.9_decoder_scale_0.01 3.14
lm_scale_1.0_decoder_scale_0.08 3.14
lm_scale_4.0_decoder_scale_4.0  3.14
lm_scale_1.0_decoder_scale_0.05 3.19
lm_scale_1.0_decoder_scale_0.03 3.25
lm_scale_6.0_decoder_scale_6.0  3.25
lm_scale_1.0_decoder_scale_0.01 3.3
lm_scale_8.0_decoder_scale_8.0  3.3
lm_scale_10.0_decoder_scale_10.0    3.36
lm_scale_2.0_decoder_scale_1.0  3.39
lm_scale_2.0_decoder_scale_0.9  3.5
lm_scale_10.0_decoder_scale_8.0 3.63
lm_scale_8.0_decoder_scale_6.0  3.67
lm_scale_2.0_decoder_scale_0.7  3.72
lm_scale_6.0_decoder_scale_4.0  3.76
lm_scale_4.0_decoder_scale_2.0  3.95
lm_scale_10.0_decoder_scale_6.0 3.99
lm_scale_2.0_decoder_scale_0.5  4.11
lm_scale_8.0_decoder_scale_4.0  4.18
lm_scale_10.0_decoder_scale_4.0 4.54
lm_scale_6.0_decoder_scale_2.0  4.67
lm_scale_2.0_decoder_scale_0.3  4.7
lm_scale_4.0_decoder_scale_1.0  4.85
lm_scale_4.0_decoder_scale_0.9  4.98
lm_scale_8.0_decoder_scale_2.0  5.18
lm_scale_4.0_decoder_scale_0.7  5.29
lm_scale_2.0_decoder_scale_0.1  5.33
lm_scale_2.0_decoder_scale_0.09 5.35
lm_scale_2.0_decoder_scale_0.08 5.38
lm_scale_2.0_decoder_scale_0.05 5.45
lm_scale_2.0_decoder_scale_0.03 5.48
lm_scale_10.0_decoder_scale_2.0 5.48
lm_scale_6.0_decoder_scale_1.0  5.5
lm_scale_2.0_decoder_scale_0.01 5.53
lm_scale_4.0_decoder_scale_0.5  5.6
lm_scale_6.0_decoder_scale_0.9  5.61
lm_scale_6.0_decoder_scale_0.7  5.79
lm_scale_8.0_decoder_scale_1.0  5.81
lm_scale_4.0_decoder_scale_0.3  5.86
lm_scale_8.0_decoder_scale_0.9  5.87
lm_scale_6.0_decoder_scale_0.5  5.94
lm_scale_10.0_decoder_scale_1.0 5.94
lm_scale_8.0_decoder_scale_0.7  5.97
lm_scale_10.0_decoder_scale_0.9 5.99
lm_scale_4.0_decoder_scale_0.1  6.04
lm_scale_4.0_decoder_scale_0.08 6.05
lm_scale_4.0_decoder_scale_0.09 6.05
lm_scale_4.0_decoder_scale_0.05 6.06
lm_scale_4.0_decoder_scale_0.03 6.08
lm_scale_6.0_decoder_scale_0.3  6.08
lm_scale_8.0_decoder_scale_0.5  6.08
lm_scale_10.0_decoder_scale_0.7 6.08
lm_scale_4.0_decoder_scale_0.01 6.09
lm_scale_10.0_decoder_scale_0.5 6.15
lm_scale_8.0_decoder_scale_0.3  6.16
lm_scale_6.0_decoder_scale_0.1  6.17
lm_scale_6.0_decoder_scale_0.08 6.17
lm_scale_6.0_decoder_scale_0.09 6.18
lm_scale_6.0_decoder_scale_0.01 6.19
lm_scale_6.0_decoder_scale_0.03 6.19
lm_scale_6.0_decoder_scale_0.05 6.19
lm_scale_10.0_decoder_scale_0.3 6.21
lm_scale_8.0_decoder_scale_0.1  6.23
lm_scale_8.0_decoder_scale_0.09 6.24
lm_scale_8.0_decoder_scale_0.01 6.25
lm_scale_8.0_decoder_scale_0.03 6.25
lm_scale_8.0_decoder_scale_0.05 6.25
lm_scale_8.0_decoder_scale_0.08 6.25
lm_scale_10.0_decoder_scale_0.1 6.26
lm_scale_10.0_decoder_scale_0.05    6.27
lm_scale_10.0_decoder_scale_0.08    6.27
lm_scale_10.0_decoder_scale_0.09    6.27
lm_scale_10.0_decoder_scale_0.01    6.28
lm_scale_10.0_decoder_scale_0.03    6.28
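
Listings in this `lm_scale_X_decoder_scale_Y  WER` form are easy to process programmatically. A minimal sketch, assuming each line follows exactly that pattern (optionally followed by a note); the parser below is an illustration and not part of the decoding script, and the sample lines are copied from the second log.

```python
import re
from typing import Dict, Tuple

# Matches e.g. "lm_scale_0.9_decoder_scale_0.7  2.81    best for test-clean"
LINE_RE = re.compile(
    r"lm_scale_([0-9.]+)_decoder_scale_([0-9.]+)\s+([0-9.]+)")


def parse_results(text: str) -> Dict[Tuple[float, float], float]:
    """Map (lm_scale, decoder_scale) to WER for every matching line."""
    results: Dict[Tuple[float, float], float] = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            lm_scale, decoder_scale, wer = map(float, m.groups())
            results[(lm_scale, decoder_scale)] = wer
    return results


sample = """\
lm_scale_0.9_decoder_scale_0.7  2.81    best for test-clean
lm_scale_0.5_decoder_scale_0.5  2.82
lm_scale_1.0_decoder_scale_1.0  2.83
lm_scale_10.0_decoder_scale_0.03    6.28
"""

results = parse_results(sample)
best = min(results, key=results.get)   # setting with the lowest WER
print(best, results[best])
```

Note that picking the scales on test_clean and reporting on the same set is optimistic; a held-out dev set would be the usual place to tune them.
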
danpovey commented 3 years ago

I just want to make sure you know how to get the unique token sequences from paths in the FSA. (Not sure if this is something that needs fixing, sorry.) By unique token sequences I mean sequences without the repeats that come from the CTC topo, or the epsilons. The way to do this is to use inner_labels='tokens' or something like that when doing the composition with the CTC topo during graph construction, and then use fsa.tokens to obtain these from the lattices when you need them. Any other way may not be correct if we are using the new/simplified CTC topo, because any repeats of the same token will be converted into a single token, so certain words or word sequences might become impossible to recognize.
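
To illustrate why removing repeats after the fact can lose information, here is a small pure-Python sketch that does not use k2. It contrasts the standard CTC collapse rule (merge runs of identical frame labels, then drop blanks) with a naive post-hoc deduplication that drops blanks first. The token ids and function names are invented for the example.

```python
from itertools import groupby
from typing import List

BLANK = 0  # assumed blank/epsilon id for this illustration


def ctc_collapse(frames: List[int], blank: int = BLANK) -> List[int]:
    """Standard CTC rule: merge runs of equal labels, then remove blanks."""
    return [label for label, _ in groupby(frames) if label != blank]


def drop_blanks_then_merge(frames: List[int], blank: int = BLANK) -> List[int]:
    """Naive order: remove blanks first, then merge adjacent repeats."""
    no_blank = [label for label in frames if label != blank]
    return [label for label, _ in groupby(no_blank)]


# Frame-level labels for a word that really contains token 5 twice in a row.
frames = [5, 5, 0, 5, 7, 7]

correct = ctc_collapse(frames)            # keeps both occurrences of 5
lossy = drop_blanks_then_merge(frames)    # merges them into one
print(correct, lossy)  # prints "[5, 5, 7] [5, 7]"
```

Once the blank that separated the two 5s is gone, the genuine double token cannot be recovered, which is the kind of loss the comment above warns about.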

csukuangfj commented 3 years ago

Here is the log when program crash while decoding test-other:

INFO:root:batch 1910, cuts processed until now is 1943/2939 (66.110922%)
INFO:root:batch 1920, cuts processed until now is 1953/2939 (66.451174%)
INFO:root:batch 1930, cuts processed until now is 1963/2939 (66.791426%)
[F] /ceph-ly/open-source/latest_k2/k2/k2/python/csrc/torch/torch_util.h:122:k2::Array1<U> k2::FromTorch(at::Tensor&) [with T = int] Check failed: tensor.strides()[0] == 1 (4 vs. 1) Expected stride: 1. Given: 4

[ Stack-Trace: ]
/ceph-ly/open-source/latest_k2/k2/build/lib/libk2_log.so(k2::internal::GetStackTrace()+0x5b) [0x7fd36a0f66ba]
/ceph-ly/open-source/latest_k2/k2/build/lib/_k2.cpython-38-x86_64-linux-gnu.so(+0x70a52) [0x7fd36b423a52]
/ceph-ly/open-source/latest_k2/k2/build/lib/_k2.cpython-38-x86_64-linux-gnu.so(+0xb8c8f) [0x7fd36b46bc8f]
/ceph-ly/open-source/latest_k2/k2/build/lib/_k2.cpython-38-x86_64-linux-gnu.so(+0x1088bd) [0x7fd36b4bb8bd]
/ceph-ly/open-source/latest_k2/k2/build/lib/_k2.cpython-38-x86_64-linux-gnu.so(+0x102af4) [0x7fd36b4b5af4]
/ceph-ly/open-source/latest_k2/k2/build/lib/_k2.cpython-38-x86_64-linux-gnu.so(+0x11e695) [0x7fd36b4d1695]
/ceph-ly/open-source/latest_k2/k2/build/lib/_k2.cpython-38-x86_64-linux-gnu.so(+0x11db07) [0x7fd36b4d0b07]
/ceph-ly/open-source/latest_k2/k2/build/lib/_k2.cpython-38-x86_64-linux-gnu.so(+0x116d22) [0x7fd36b4c9d22]
/ceph-ly/open-source/latest_k2/k2/build/lib/_k2.cpython-38-x86_64-linux-gnu.so(+0x116f14) [0x7fd36b4c9f14]
/ceph-ly/open-source/latest_k2/k2/build/lib/_k2.cpython-38-x86_64-linux-gnu.so(+0x54187) [0x7fd36b407187]
python(PyCFunction_Call+0x56) [0x5ff8a6]
python(_PyObject_MakeTpCall+0x28f) [0x5fff6f]
python(_PyEval_EvalFrameDefault+0x5b9e) [0x57e35e]
python(_PyFunction_Vectorcall+0x19c) [0x602b2c]
python(PyVectorcall_Call+0x51) [0x5ff3b1]
/ceph-ly/py38/lib/python3.8/site-packages/torch/lib/libtorch_python.so(THPFunction_apply(_object*, _object*)+0x8fd) [0x7fd45ebdb78d]
python(PyCFunction_Call+0xfb) [0x5ff94b]
python(_PyObject_MakeTpCall+0x28f) [0x5fff6f]
python(_PyEval_EvalFrameDefault+0x5b9e) [0x57e35e]
python(_PyEval_EvalCodeWithName+0x25c) [0x5765ec]
python(_PyFunction_Vectorcall+0x442) [0x602dd2]
python(_PyEval_EvalFrameDefault+0x1930) [0x57a0f0]
python(_PyFunction_Vectorcall+0x19c) [0x602b2c]
python(_PyEval_EvalFrameDefault+0x53f0) [0x57dbb0]
python(_PyEval_EvalCodeWithName+0x25c) [0x5765ec]
python(_PyFunction_Vectorcall+0x442) [0x602dd2]
python(_PyEval_EvalFrameDefault+0x1930) [0x57a0f0]
python(_PyFunction_Vectorcall+0x19c) [0x602b2c]
python(PyVectorcall_Call+0x51) [0x5ff3b1]
python(_PyEval_EvalFrameDefault+0x1c4a) [0x57a40a]
python(_PyEval_EvalCodeWithName+0x25c) [0x5765ec]
python(_PyFunction_Vectorcall+0x247) [0x602bd7]
python(_PyEval_EvalFrameDefault+0x619) [0x578dd9]
python(_PyEval_EvalCodeWithName+0x25c) [0x5765ec]
python(_PyFunction_Vectorcall+0x442) [0x602dd2]
python(_PyEval_EvalFrameDefault+0x1930) [0x57a0f0]
python(_PyEval_EvalCodeWithName+0x25c) [0x5765ec]
python(_PyFunction_Vectorcall+0x442) [0x602dd2]
python(PyVectorcall_Call+0x51) [0x5ff3b1]
python(_PyEval_EvalFrameDefault+0x1c4a) [0x57a40a]
python(_PyEval_EvalCodeWithName+0x25c) [0x5765ec]
python(_PyFunction_Vectorcall+0x442) [0x602dd2]
python(_PyEval_EvalFrameDefault+0x1930) [0x57a0f0]
python(_PyEval_EvalCodeWithName+0x25c) [0x5765ec]
python(_PyFunction_Vectorcall+0x247) [0x602bd7]
python(_PyEval_EvalFrameDefault+0x619) [0x578dd9]
python(_PyEval_EvalCodeWithName+0x25c) [0x5765ec]
python() [0x662c2e]
python(PyRun_FileExFlags+0x97) [0x662d07]
python(PyRun_SimpleFileExFlags+0x17f) [0x663a1f]

Traceback (most recent call last):
  File "bpe_ctc_att_conformer_decode.py", line 617, in <module>
  File "bpe_ctc_att_conformer_decode.py", line 576, in main
    model=model,
  File "/ceph-ly/py38/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "bpe_ctc_att_conformer_decode.py", line 278, in decode
    model=model,
  File "bpe_ctc_att_conformer_decode.py", line 240, in decode_one_batch
    lm_scale_list)
  File "/ceph-ly/py38/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/ceph-ly/open-source/to_submit/lattice_rescore_snwofall/snowfall/snowfall/decoding/lm_rescore.py", line 320, in rescore_with_whole_lattice
    best_paths = k2.shortest_path(inv_lats, use_double_scores=True)
  File "/ceph-ly/open-source/latest_k2/k2/k2/python/k2/fsa_algo.py", line 541, in shortest_path
    out_fsa = k2.utils.fsa_from_unary_function_tensor(fsa, ragged_arc, arc_map)
  File "/ceph-ly/open-source/latest_k2/k2/k2/python/k2/utils.py", line 449, in fsa_from_unary_function_tensor
    setattr(dest, name, index_select(value, arc_map,
  File "/ceph-ly/open-source/latest_k2/k2/k2/python/k2/ops.py", line 159, in index_select
    ans = _IndexSelectFunction.apply(src, index, default_value)
  File "/ceph-ly/open-source/latest_k2/k2/k2/python/k2/ops.py", line 65, in forward
    return _k2.index_select(src, index, default_value)
RuntimeError: Some bad things happed.

Did you use a batch size of 1? If your decoding result is an empty FSA, you will encounter this kind of error when calling k2.shortest_path. The solution is to return rescoring_lats directly. https://github.com/k2-fsa/snowfall/blob/5c979cce1b6a9c9bf72ec484746143b321ae73a7/snowfall/decoding/lm_rescore.py#L306

The reason is that the following line https://github.com/k2-fsa/k2/blob/069425e301472e7ea31ea982ba2a943ac5fcb649/k2/python/k2/fsa.py#L894

            if src_name == 'labels':
                value = value.clone()

returns a tensor with stride == 4 if value is empty.

danpovey commented 3 years ago

We should modify the code that crashes to be insensitive to the stride if any of the dims is zero. Kangwei, perhaps you could do that?
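
The idea can be sketched in plain Python as follows. This is only an illustration of the check, not the actual k2 change; the function name `is_stride_one_or_empty` is hypothetical. The stride of a 1-D tensor is ignored when the tensor has no elements.

```python
def is_stride_one_or_empty(shape, strides):
    """Return True if a 1-D tensor can be treated as contiguous.

    An empty tensor has no elements to step over, so its stride is
    irrelevant and the check passes regardless of the reported value.
    """
    if len(shape) != 1 or len(strides) != 1:
        raise ValueError("expected a 1-D tensor")
    if shape[0] == 0:
        return True
    return strides[0] == 1


# The failing case from the log: an empty tensor reporting stride 4.
print(is_stride_one_or_empty((0,), (4,)))
# A non-empty tensor with stride 4 must still be rejected.
print(is_stride_one_or_empty((5,), (4,)))
```
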

pkufool commented 3 years ago

We should modify the code that crashes to be insensitive to the stride if any of the dims is zero. Kangwei, perhaps you could do that?

Sure.

glynpu commented 3 years ago

I just want to make sure you know how to get the unique token sequences from paths in the FSA. (Not sure if this is something that needs fixing, sorry).

After removing repeated tokens and using log_semiring=False, the WER on test-clean decreased from 2.81 (last week) to 2.73 (now).

Detailed results with different scale combinations:

decoder_scale (right) / lm_scale (below) 0.1 0.3 0.5 0.6 0.7 0.9 1.0 1.1 1.2 1.3 1.5 1.7 1.9 2.0
0.1 2.98 2.95 2.92 2.9 2.9 2.89 2.89 2.88 2.87 2.86 2.85 2.85 2.85 2.84
0.3 2.91 2.88 2.88 2.88 2.87 2.87 2.85 2.85 2.85 2.85 2.84 2.84 2.83 2.83
0.5 2.88 2.86 2.83 2.84 2.84 2.84 2.83 2.84 2.83 2.82 2.82 2.83 2.83 2.83
0.6 2.86 2.82 2.82 2.81 2.82 2.82 2.82 2.82 2.82 2.81 2.81 2.82 2.82 2.82
0.7 2.87 2.8 2.78 2.79 2.8 2.81 2.81 2.8 2.8 2.8 2.8 2.82 2.82 2.82
0.9 2.99 2.84 2.78 2.76 2.77 2.76 2.76 2.76 2.77 2.78 2.79 2.79 2.8 2.8
1.0 3.12 2.89 2.8 2.77 2.77 2.75 2.74 2.74 2.76 2.77 2.78 2.79 2.79 2.79
1.1 3.32 3.0 2.82 2.8 2.77 2.74 2.73 2.74 2.73 2.74 2.77 2.78 2.78 2.78
1.2 3.58 3.13 2.9 2.85 2.8 2.77 2.74 2.74 2.73 2.74 2.73 2.76 2.77 2.77
1.3 3.87 3.3 3.0 2.92 2.87 2.79 2.76 2.77 2.75 2.74 2.74 2.74 2.75 2.76
1.5 4.45 3.78 3.28 3.17 3.03 2.88 2.85 2.82 2.78 2.77 2.74 2.73 2.74 2.73
1.7 4.84 4.24 3.76 3.54 3.31 3.06 2.99 2.93 2.88 2.84 2.8 2.77 2.75 2.75
1.9 5.11 4.65 4.15 3.95 3.73 3.33 3.2 3.12 3.03 2.98 2.88 2.84 2.8 2.79
2.0 5.19 4.81 4.37 4.11 3.92 3.54 3.34 3.23 3.13 3.05 2.95 2.88 2.83 2.81
glynpu commented 3 years ago

The WER with batch_size > 1 is slightly higher than with batch_size == 1 (2.74 vs. 2.73), and the lowest WER is obtained at a different lm_scale/decoder_scale setting.

Detailed results:

decoder_scale (right) / lm_scale (below) 0.1 0.3 0.5 0.6 0.7 0.9 1.0 1.1 1.2 1.3 1.5 1.7 1.9 2.0
0.1 2.99 2.98 2.94 2.92 2.92 2.92 2.91 2.91 2.9 2.9 2.89 2.89 2.89 2.89
0.3 2.9 2.9 2.9 2.9 2.9 2.89 2.88 2.87 2.88 2.88 2.86 2.86 2.86 2.86
0.5 2.88 2.85 2.85 2.87 2.86 2.85 2.85 2.86 2.85 2.85 2.85 2.85 2.86 2.86
0.6 2.86 2.83 2.82 2.82 2.84 2.84 2.84 2.84 2.85 2.85 2.85 2.85 2.86 2.86
0.7 2.86 2.81 2.79 2.8 2.81 2.83 2.83 2.83 2.84 2.84 2.85 2.86 2.85 2.85
0.9 2.98 2.84 2.79 2.76 2.77 2.78 2.78 2.8 2.81 2.82 2.82 2.82 2.83 2.84
1.0 3.12 2.88 2.81 2.79 2.77 2.76 2.76 2.78 2.79 2.81 2.82 2.81 2.82 2.82
1.1 3.31 3.0 2.83 2.81 2.79 2.76 2.75 2.75 2.75 2.77 2.8 2.8 2.81 2.81
1.2 3.59 3.13 2.9 2.85 2.81 2.79 2.77 2.76 2.75 2.76 2.76 2.79 2.8 2.8
1.3 3.87 3.3 3.01 2.93 2.87 2.79 2.78 2.79 2.77 2.76 2.76 2.77 2.78 2.79
1.5 4.43 3.81 3.29 3.17 3.05 2.9 2.87 2.84 2.8 2.78 2.77 2.74 2.75 2.75
1.7 4.86 4.28 3.79 3.56 3.32 3.07 3.0 2.95 2.89 2.87 2.82 2.79 2.78 2.77
1.9 5.15 4.68 4.17 3.96 3.74 3.33 3.21 3.13 3.04 2.99 2.88 2.85 2.82 2.81
2.0 5.22 4.83 4.37 4.13 3.92 3.55 3.34 3.24 3.14 3.07 2.95 2.87 2.84 2.82
glynpu commented 3 years ago

As suggested by fangjun, the crash when decoding test-other is solved by using batch_size > 1. Current results are:

  Wer% on test_clean wer% on test_other
Encoder + ctc 3.32 7.96
Encoder + (ctc + 3-gram) + 4-gram lattice rescore 2.92 *(to be tested)
Encoder + (ctc + 3-gram) + 4-gram lattice rescore + (transformer decoder n-best rescore) num-paths-for-decoder-rescore=100 2.87 *(to be tested)
Encoder + (ctc + 3-gram) + 4-gram lattice rescore + (transformer decoder n-best rescore) num-paths-for-decoder-rescore=500 2.86 *(to be tested)
+log_semiring=False and remove repeated tokens 2.73 6.11
danpovey commented 3 years ago

Fantastic! I don't think those small differences in WER are significant, likely just noise.
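
A rough back-of-the-envelope check supports this. The sketch below treats each reference word as an independent error trial, which is a crude assumption since recognition errors are correlated, and assumes test-clean has roughly 52.6k reference words (a figure not stated in this thread). Under those assumptions the standard error of the WER is several times larger than the 0.01 gap between 2.73 and 2.74.

```python
import math


def wer_standard_error(wer_percent, num_ref_words):
    """Binomial standard error of a WER, in percentage points."""
    p = wer_percent / 100.0
    return 100.0 * math.sqrt(p * (1.0 - p) / num_ref_words)


NUM_WORDS = 52_576  # assumed number of reference words in test-clean

se = wer_standard_error(2.73, NUM_WORDS)
diff = 2.74 - 2.73

print(f"standard error: {se:.3f}, observed difference: {diff:.2f}")
```
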

On Tue, Jul 13, 2021 at 8:04 PM LIyong.Guo @.***> wrote:

As suggested by fangjun, the crash when decode test-other is solved by batch_size > 1. Current results are: Wer% on test_clean wer% on test_other Encoder + ctc 3.32 7.96 Encoder + (ctc + 3-gram) + 4-gram lattice rescore 2.92 (to be tested) Encoder + (ctc + 3-gram) + 4-gram lattice rescore + (transformer decoder n-best rescore) num-paths-for-decoder-rescore=100 2.87 (to be tested) Encoder + (ctc + 3-gram) + 4-gram lattice rescore + (transformer decoder n-best rescore) num-paths-for-decoder-rescore=500 2.86 *(to be tested) +log_semering=False and remove repeated tokens 2.73 6.11


glynpu commented 2 years ago

A better model is obtained with the following modifications:

  feat-norm learning-factor warm-up steps epochs
before no 10 40,000 40 epochs (avg=10, epochs 26-35)
current yes 5 80,000 (around 10 epochs) 50 epochs (avg=20, epochs 31-50)
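
To get a feel for what the two settings imply, here is a sketch under the assumption that learning-factor and warm-up steps are the parameters of a Noam-style schedule (the comment does not name the scheduler). The model dimension of 256 is also an assumption. It computes the peak learning rate, which a Noam schedule reaches at the last warm-up step.

```python
def noam_lr(step, factor, warmup, d_model=256):
    """Noam-style learning rate: linear warm-up, then inverse-sqrt decay.

    d_model=256 is an assumed model dimension, not taken from the thread.
    """
    step = max(step, 1)
    return factor * d_model ** -0.5 * min(step ** -0.5, step * warmup ** -1.5)


# Peak learning rate is reached when step == warmup.
before_peak = noam_lr(40_000, factor=10, warmup=40_000)
current_peak = noam_lr(80_000, factor=5, warmup=80_000)

print(f"before: {before_peak:.6f}, current: {current_peak:.6f}")
```

Under these assumptions the current setting warms up for twice as long and peaks at a lower learning rate than the earlier one.
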

Detailed WER on test-clean:

  before current
Encoder + ctc 3.32 2.98 (WER of the ESPnet released model is 2.97/3.00)
Encoder + TLG + 4-gram lattice rescore + nbest rescore with transformer decoder with log_semiring=False and remove repeated tokens 2.73 2.54

Results with different combinations of decoder_scale and lm_scale (WER = 2.54 is obtained with decoder_scale = 1.7 and lm_scale = 1.7):

decoder_scale (right) / lm_scale (below) 0.1 0.3 0.5 0.6 0.7 0.9 1.0 1.1 1.2 1.3 1.5 1.7 1.9 2.0 2.1 2.2 2.3 2.4 2.5
0.1 2.81 2.78 2.75 2.75 2.74 2.74 2.73 2.73 2.73 2.72 2.72 2.71 2.71 2.7 2.69 2.7 2.7 2.7 2.7
0.3 2.75 2.72 2.7 2.69 2.68 2.68 2.69 2.69 2.69 2.68 2.68 2.67 2.67 2.66 2.66 2.67 2.66 2.66 2.66
0.5 2.7 2.66 2.67 2.66 2.67 2.66 2.65 2.66 2.66 2.65 2.64 2.64 2.64 2.63 2.63 2.63 2.63 2.63 2.63
0.6 2.68 2.66 2.64 2.64 2.63 2.65 2.65 2.65 2.65 2.64 2.63 2.63 2.63 2.62 2.61 2.61 2.61 2.62 2.62
0.7 2.67 2.63 2.62 2.63 2.62 2.63 2.64 2.63 2.64 2.64 2.64 2.62 2.61 2.61 2.61 2.61 2.61 2.62 2.61
0.9 2.73 2.61 2.6 2.6 2.61 2.61 2.62 2.61 2.61 2.61 2.61 2.6 2.61 2.61 2.62 2.61 2.61 2.61 2.61
1.0 2.85 2.65 2.59 2.59 2.6 2.6 2.59 2.6 2.59 2.59 2.59 2.61 2.6 2.61 2.61 2.61 2.61 2.61 2.61
1.1 3.04 2.71 2.62 2.59 2.59 2.6 2.6 2.6 2.58 2.59 2.59 2.59 2.59 2.59 2.6 2.6 2.6 2.61 2.6
1.2 3.31 2.86 2.65 2.62 2.59 2.58 2.57 2.58 2.58 2.58 2.58 2.59 2.59 2.59 2.58 2.58 2.59 2.6 2.6
1.3 3.52 3.04 2.75 2.66 2.62 2.57 2.57 2.56 2.56 2.57 2.57 2.58 2.59 2.59 2.59 2.59 2.58 2.58 2.58
1.5 4.0 3.47 3.06 2.89 2.8 2.64 2.6 2.58 2.59 2.56 2.56 2.55 2.56 2.56 2.57 2.58 2.58 2.58 2.59
1.7 4.41 3.87 3.43 3.26 3.07 2.83 2.74 2.67 2.64 2.6 2.58 2.54 2.56 2.55 2.55 2.55 2.55 2.57 2.57
1.9 4.64 4.26 3.8 3.61 3.41 3.12 2.99 2.86 2.79 2.73 2.64 2.57 2.56 2.54 2.56 2.56 2.55 2.55 2.55
2.0 4.72 4.38 3.98 3.77 3.59 3.29 3.13 3.01 2.88 2.81 2.68 2.62 2.57 2.56 2.56 2.55 2.56 2.56 2.55
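
A grid like the one above can be searched for its minimum with a few lines of Python. This is a sketch assuming the flattened layout shown here (a header row of decoder_scale values, then one row per lm_scale); the helpers `parse_grid` and `argmin_all` are hypothetical names. Only three rows of the table are included to keep the example short.

```python
GRID = """\
decoder_scale(right)lm_scale(below) 0.1 0.3 0.5 0.6 0.7 0.9 1.0 1.1 1.2 1.3 1.5 1.7 1.9 2.0 2.1 2.2 2.3 2.4 2.5
1.5 4.0 3.47 3.06 2.89 2.8 2.64 2.6 2.58 2.59 2.56 2.56 2.55 2.56 2.56 2.57 2.58 2.58 2.58 2.59
1.7 4.41 3.87 3.43 3.26 3.07 2.83 2.74 2.67 2.64 2.6 2.58 2.54 2.56 2.55 2.55 2.55 2.55 2.57 2.57
1.9 4.64 4.26 3.8 3.61 3.41 3.12 2.99 2.86 2.79 2.73 2.64 2.57 2.56 2.54 2.56 2.56 2.55 2.55 2.55
"""


def _floats(tokens):
    """Keep only the tokens that parse as numbers (skips the header label)."""
    out = []
    for t in tokens:
        try:
            out.append(float(t))
        except ValueError:
            pass
    return out


def parse_grid(text):
    """Map (lm_scale, decoder_scale) -> WER from a flattened grid."""
    lines = text.strip().splitlines()
    decoder_scales = _floats(lines[0].split())
    grid = {}
    for line in lines[1:]:
        fields = _floats(line.split())
        lm_scale, wers = fields[0], fields[1:]
        if len(wers) != len(decoder_scales):
            raise ValueError(f"row {lm_scale} does not match the header")
        for dec, wer in zip(decoder_scales, wers):
            grid[(lm_scale, dec)] = wer
    return grid


def argmin_all(grid):
    """Return the lowest WER and every (lm_scale, decoder_scale) reaching it."""
    best = min(grid.values())
    return best, sorted(k for k, v in grid.items() if v == best)


best_wer, best_cells = argmin_all(parse_grid(GRID))
print(best_wer, best_cells)
```

Listing all minimal cells shows that the best WER is not unique to one scale pair, which fits the flat region visible in the lower-right part of the table.
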
danpovey commented 2 years ago

Great!!

Alex-Songs commented 2 years ago

Hi glynpu: This is very cool work. Is there a recipe to reproduce your results? Thanks! @glynpu

glynpu commented 2 years ago

This is very cool work. Is there a recipe to reproduce your results?

The current PR is mainly about the decoding part, and #219 covers the corresponding training part. Follow egs/librispeech/asr/simple_v1/bpe_run.sh in #219 and run stage 0 and stage 1 to reproduce my results. @Alex-Songs

Alex-Songs commented 2 years ago

thanks! @glynpu