k2-fsa / icefall

https://k2-fsa.github.io/icefall/
Apache License 2.0

tdnn_lstm_ctc training error of librispeech #49

Closed cdxie closed 6 months ago

cdxie commented 3 years ago

Hi, when I ran the tdnn_lstm_ctc training on LibriSpeech, I got an error during epoch 5. Please take a look, thanks. Error log:

2021-09-17 13:28:10,440 INFO [train.py:450] Epoch 5, batch 8620, batch avg loss 1.0633, total avg loss: 1.1221, batch size: 39
2021-09-17 13:28:22,302 INFO [train.py:450] Epoch 5, batch 8630, batch avg loss 1.1049, total avg loss: 1.1507, batch size: 41
2021-09-17 13:28:25,554 WARNING [cut.py:1694] To perform mix, energy must be non-zero and non-negative (got 0.0). MonoCut with id "845a0a69-f758-7b6a-90d8-ba99fa1795c4" will not be mixed in.
2021-09-17 13:28:36,682 INFO [train.py:450] Epoch 5, batch 8640, batch avg loss 1.1305, total avg loss: 1.1622, batch size: 40
2021-09-17 13:28:49,311 INFO [train.py:450] Epoch 5, batch 8650, batch avg loss 1.1228, total avg loss: 1.1774, batch size: 37
[F] /workspace/k2/k2/csrc/intersect_dense.cu:863:lambda [](signed int)->void::operator()(signed int)->void block:[0,0,0], thread: [0,0,0] Check failed: tot_score_end == tot_score_start || fabs(tot_score_end - tot_score_start) < 1.0 nan vs nan
... (same message, interleaved in the raw output, for threads [1,0,0] through [39,0,0])
/workspace/k2/k2/csrc/intersect_dense.cu:863: lambda [](signed int)->void::operator()(signed int)->void: block: [0,0,0], thread: [0,0,0] Assertion Some bad things happened failed.
... (same message for threads [1,0,0] through [39,0,0])
[F] /workspace/k2/k2/csrc/array.h:341:T k2::Array1::operator const [with T = int; int32_t = int] Check failed: ret == cudaSuccess (710 vs. 0) Error: device-side assert triggered.
[ Stack-Trace: ]
/opt/conda/lib/python3.8/site-packages/k2-1.6.dev20210906+cuda11.1.torch1.8.0-py3.8-linux-x86_64.egg/libk2_log.so(k2::internal::GetStackTrace()+0x47) [0x7f8c456419f7]
/opt/conda/lib/python3.8/site-packages/k2-1.6.dev20210906+cuda11.1.torch1.8.0-py3.8-linux-x86_64.egg/libk2context.so(k2::Array1::operator const+0xeb9) [0x7f8c4593c8e9]
/opt/conda/lib/python3.8/site-packages/k2-1.6.dev20210906+cuda11.1.torch1.8.0-py3.8-linux-x86_64.egg/libk2context.so(k2::Renumbering::ComputeOld2New()+0x14e) [0x7f8c459377ee]
/opt/conda/lib/python3.8/site-packages/k2-1.6.dev20210906+cuda11.1.torch1.8.0-py3.8-linux-x86_64.egg/libk2context.so(k2::Renumbering::ComputeNew2Old()+0x7f8) [0x7f8c45938f68]
/opt/conda/lib/python3.8/site-packages/k2-1.6.dev20210906+cuda11.1.torch1.8.0-py3.8-linux-x86_64.egg/libk2context.so(k2::MultiGraphDenseIntersect::FormatOutput(k2::Array1, k2::Array1)+0x7ec) [0x7f8c45a9e47c]
/opt/conda/lib/python3.8/site-packages/k2-1.6.dev20210906+cuda11.1.torch1.8.0-py3.8-linux-x86_64.egg/libk2context.so(k2::IntersectDense(k2::Ragged&, k2::DenseFsaVec&, k2::Array1 const, float, k2::Ragged, k2::Array1, k2::Array1)+0x420) [0x7f8c45a8e900]
/opt/conda/lib/python3.8/site-packages/k2-1.6.dev20210906+cuda11.1.torch1.8.0-py3.8-linux-x86_64.egg/_k2.cpython-38-x86_64-linux-gnu.so(+0x65390) [0x7f8c4bad4390]
/opt/conda/lib/python3.8/site-packages/k2-1.6.dev20210906+cuda11.1.torch1.8.0-py3.8-linux-x86_64.egg/_k2.cpython-38-x86_64-linux-gnu.so(+0x1ee9e) [0x7f8c4ba8de9e]
python3(PyCFunction_Call+0x58) [0x55e5b8fb72d8]
python3(_PyObject_MakeTpCall+0x23c) [0x55e5b8fa6edc]
python3(_PyEval_EvalFrameDefault+0x11dd) [0x55e5b902f4ad]
python3(_PyEval_EvalCodeWithName+0x300) [0x55e5b8ffc760]
python3(_PyFunction_Vectorcall+0x1e3) [0x55e5b8ffd593]
python3(PyObject_CallObject+0x52) [0x55e5b9002982]
/opt/conda/lib/python3.8/site-packages/torch/lib/libtorch_python.so(THPFunction_apply(_object, _object)+0x8fd) [0x7f8d427dc39d]
python3(PyCFunction_Call+0xe0) [0x55e5b8fb7360]
python3(_PyObject_MakeTpCall+0x23c) [0x55e5b8fa6edc]
python3(_PyEval_EvalFrameDefault+0x45a9) [0x55e5b9032879]
python3(_PyEval_EvalCodeWithName+0x300) [0x55e5b8ffc760]
python3(_PyFunction_Vectorcall+0x1e3) [0x55e5b8ffd593]
python3(+0x10425f) [0x55e5b8f6725f]
python3(_PyEval_EvalCodeWithName+0x300) [0x55e5b8ffc760]
python3(_PyFunction_Vectorcall+0x1e3) [0x55e5b8ffd593]
python3(+0x19aac9) [0x55e5b8ffdac9]
python3(PyObject_Call+0x414) [0x55e5b8fa7874]
python3(_PyEval_EvalFrameDefault+0x2088) [0x55e5b9030358]
python3(_PyEval_EvalCodeWithName+0x300) [0x55e5b8ffc760]
python3(_PyObject_Call_Prepend+0x181) [0x55e5b8ffe051]
python3(+0x19b3fa) [0x55e5b8ffe3fa]
python3(_PyObject_MakeTpCall+0x23c) [0x55e5b8fa6edc]
python3(_PyEval_EvalFrameDefault+0x475) [0x55e5b902e745]
python3(_PyEval_EvalCodeWithName+0x300) [0x55e5b8ffc760]
python3(_PyFunction_Vectorcall+0x1e3) [0x55e5b8ffd593]
python3(+0x103562) [0x55e5b8f66562]
python3(_PyEval_EvalCodeWithName+0x300) [0x55e5b8ffc760]
python3(_PyFunction_Vectorcall+0x1e3) [0x55e5b8ffd593]
python3(+0x103562) [0x55e5b8f66562]
python3(_PyEval_EvalCodeWithName+0x300) [0x55e5b8ffc760]
python3(_PyFunction_Vectorcall+0x1e3) [0x55e5b8ffd593]
python3(+0x103562) [0x55e5b8f66562]
python3(_PyEval_EvalCodeWithName+0x300) [0x55e5b8ffc760]
python3(_PyFunction_Vectorcall+0x1e3) [0x55e5b8ffd593]
python3(+0x103562) [0x55e5b8f66562]
python3(_PyFunction_Vectorcall+0x10b) [0x55e5b8ffd4bb]
python3(+0x10425f) [0x55e5b8f6725f]
python3(_PyEval_EvalCodeWithName+0x300) [0x55e5b8ffc760]
python3(PyEval_EvalCode+0x23) [0x55e5b90914e3]
python3(+0x22e584) [0x55e5b9091584]
python3(+0x2547c4) [0x55e5b90b77c4]
python3(+0x115620) [0x55e5b8f78620]

Traceback (most recent call last):
  File "./tdnn_lstm_ctc/train.py", line 616, in <module>
    main()
  File "./tdnn_lstm_ctc/train.py", line 612, in main
    run(rank=0, world_size=1, args=args)
  File "./tdnn_lstm_ctc/train.py", line 575, in run
    train_one_epoch(
  File "./tdnn_lstm_ctc/train.py", line 424, in train_one_epoch
    loss = compute_loss(
  File "./tdnn_lstm_ctc/train.py", line 317, in compute_loss
    loss = k2.ctc_loss(
  File "/opt/conda/lib/python3.8/site-packages/k2-1.6.dev20210906+cuda11.1.torch1.8.0-py3.8-linux-x86_64.egg/k2/ctc_loss.py", line 136, in ctc_loss
    return m(decoding_graph, dense_fsa_vec, target_lengths)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/k2-1.6.dev20210906+cuda11.1.torch1.8.0-py3.8-linux-x86_64.egg/k2/ctc_loss.py", line 80, in forward
    lattice = intersect_dense(decoding_graph, dense_fsa_vec,
  File "/opt/conda/lib/python3.8/site-packages/k2-1.6.dev20210906+cuda11.1.torch1.8.0-py3.8-linux-x86_64.egg/k2/autograd.py", line 810, in intersect_dense
    _IntersectDenseFunction.apply(a_fsas, b_fsas, out_fsa, output_beam,
  File "/opt/conda/lib/python3.8/site-packages/k2-1.6.dev20210906+cuda11.1.torch1.8.0-py3.8-linux-x86_64.egg/k2/autograd.py", line 550, in forward
    ragged_arc, arc_map_a, arc_map_b = _k2.intersect_dense(
RuntimeError: Some bad things happed.
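
For readers of the log: the failed check compares two total scores, and the log prints nan for both of them. The snippet below is a minimal sketch of why NaN values can never satisfy that condition. It is a plain-Python transcription of the condition printed in the log, not k2's actual CUDA code, and the function name tot_scores_agree is made up for illustration. If this reading is right, the assertion is a symptom of NaN scores reaching the intersection, whatever their origin.

```python
import math


def tot_scores_agree(tot_score_start: float, tot_score_end: float) -> bool:
    """Transcription of the condition printed in the log (hypothetical helper).

    Mirrors: tot_score_end == tot_score_start
             || fabs(tot_score_end - tot_score_start) < 1.0
    """
    return (
        tot_score_end == tot_score_start
        or math.fabs(tot_score_end - tot_score_start) < 1.0
    )


# Finite scores that differ by less than 1.0 pass the check.
print(tot_scores_agree(-12.5, -12.4))

# Equal infinities pass through the equality branch.
print(tot_scores_agree(-math.inf, -math.inf))

# NaN is never equal to anything and never less than anything,
# so both branches are False and the check fails.
print(tot_scores_agree(math.nan, math.nan))
```

The three calls print True, True and False, matching the "nan vs nan" failure in the log.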

csukuangfj commented 3 years ago

Which version of k2 are you using? You can use

$ python3 -m k2.version

to find that.

If you are not using the latest k2 (most probably), please update your k2.
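
A quick way to decide whether an update is needed is to compare the installed release with a chosen minimum. The following is only a sketch: the helper names release_tuple and needs_update are hypothetical, and the minimum of 1.8 is a placeholder, not a release confirmed to contain a fix. The sample version string is the one visible in the install path in the traceback.

```python
import re


def release_tuple(version: str) -> tuple:
    """Return the leading numeric release of a version string.

    Example: '1.6.dev20210906+cuda11.1.torch1.8.0' -> (1, 6)
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"no numeric release found in {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def needs_update(installed: str, minimum: str) -> bool:
    """True if the installed release is older than the minimum release."""
    a, b = release_tuple(installed), release_tuple(minimum)
    # Pad with zeros so that '1.8' and '1.8.0' compare as equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a < b


# Version string taken from the install path in the traceback;
# "1.8" is a placeholder minimum (assumption).
installed = "1.6.dev20210906+cuda11.1.torch1.8.0"
print(release_tuple(installed))
print(needs_update(installed, "1.8"))
```

In a real environment the installed string could be obtained with importlib.metadata.version("k2"), assuming k2 was installed as a distribution with that name.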

cdxie commented 3 years ago

OK, the version of k2 I am using is 1.6. Do you mean that changing the k2 version (from 1.6 to 1.8) would solve this error?

root@17eefe7f5ceb:/workspace# python3 -m k2.version
Collecting environment information...

k2 version: 1.6
Build type: Release
Git SHA1: 818b138b33eabe440601df8910a2b97ac088594b
Git date: Thu Aug 26 13:03:25 2021
Cuda used to build k2: 11.1
cuDNN used to build k2: 8.0.5
Python version used to build k2: 3.8
OS used to build k2:
CMake version: 3.18.0
GCC version: 7.5.0
CMAKE_CUDA_FLAGS: --expt-extended-lambda -gencode arch=compute_35,code=sm_35 --expt-extended-lambda -gencode arch=compute_50,code=sm_50 --expt-extended-lambda -gencode arch=compute_60,code=sm_60 --expt-extended-lambda -gencode arch=compute_61,code=sm_61 --expt-extended-lambda -gencode arch=compute_70,code=sm_70 --expt-extended-lambda -gencode arch=compute_75,code=sm_75 -D_GLIBCXX_USE_CXX11_ABI=0 --compiler-options -Wall --compiler-options -Wno-unknown-pragmas --compiler-options -Wno-strict-overflow
CMAKE_CXX_FLAGS: -D_GLIBCXX_USE_CXX11_ABI=0 -Wno-strict-overflow
PyTorch version used to build k2: 1.8.0
PyTorch is using Cuda: 11.1
NVTX enabled: True
With CUDA: True
Disable debug: True
Sync kernels : False
Disable checks: False

csukuangfj commented 2 years ago

Did you fix the error by updating k2?