ahban closed this issue 7 months ago
Are you able to run the tests in https://github.com/k2-fsa/k2/tree/master/k2/python/tests?
For instance, you can run:
cd k2/python/tests
python3 ./remove_epsilon_self_loops_test.py
python3 ./remove_epsilon_test.py
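The two commands above can also be run in one go, so that a failure in the first script does not hide the result of the second. This is only a minimal sketch, assuming it is started from k2/python/tests; the helper name `run_test_scripts` is hypothetical and not part of k2.

```python
import subprocess
import sys


def run_test_scripts(scripts, cwd=None):
    """Run each script with the current interpreter; return {script: exit code}."""
    results = {}
    for script in scripts:
        # check=False: a failing test must not stop the remaining scripts.
        proc = subprocess.run([sys.executable, script], cwd=cwd, check=False)
        results[script] = proc.returncode
    return results


if __name__ == "__main__":
    codes = run_test_scripts(
        ["./remove_epsilon_self_loops_test.py", "./remove_epsilon_test.py"]
    )
    for script, code in codes.items():
        status = "ok" if code == 0 else "failed (exit code {})".format(code)
        print("{}: {}".format(script, status))
```

A non-zero exit code only says that the script failed; the details are still in the output that the script itself prints.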
The second script fails to run; its output is below.
.F/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [32,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [96,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [0,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [1,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [2,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [64,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [65,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [66,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
[F] /home/ddp-aban/soft/k2/k2/csrc/array.h:385:T k2::Array1<T>::operator[](int32_t) const [with T = int; int32_t = int] Check failed: ret == cudaSuccess (710 vs. 0) Error: device-side assert triggered.
[ Stack-Trace: ]
/home/ddp-aban/soft/anaconda3/envs/k2/lib/python3.8/site-packages/k2-1.17.dev20220720+cuda11.1.torch1.8.1-py3.8-linux-x86_64.egg/k2/lib64/libk2_log.so(k2::internal::GetStackTrace()+0x34) [0x7f2aa1646c34]
/home/ddp-aban/soft/anaconda3/envs/k2/lib/python3.8/site-packages/k2-1.17.dev20220720+cuda11.1.torch1.8.1-py3.8-linux-x86_64.egg/k2/lib64/libk2context.so(k2::Array1<int>::operator[](int) const+0x842) [0x7f2aa1fa70e2]
/home/ddp-aban/soft/anaconda3/envs/k2/lib/python3.8/site-packages/k2-1.17.dev20220720+cuda11.1.torch1.8.1-py3.8-linux-x86_64.egg/k2/lib64/libk2context.so(k2::Renumbering::ComputeOld2New()+0x1c1) [0x7f2aa1fa13f1]
/home/ddp-aban/soft/anaconda3/envs/k2/lib/python3.8/site-packages/k2-1.17.dev20220720+cuda11.1.torch1.8.1-py3.8-linux-x86_64.egg/k2/lib64/libk2context.so(k2::Renumbering::ComputeNew2Old()+0x998) [0x7f2aa1fa2f88]
/home/ddp-aban/soft/anaconda3/envs/k2/lib/python3.8/site-packages/k2-1.17.dev20220720+cuda11.1.torch1.8.1-py3.8-linux-x86_64.egg/k2/lib64/libk2context.so(k2::SubsetRaggedShape(k2::RaggedShape&, k2::Renumbering&, int, k2::Array1<int>*)+0x330) [0x7f2aa2169f00]
/home/ddp-aban/soft/anaconda3/envs/k2/lib/python3.8/site-packages/k2-1.17.dev20220720+cuda11.1.torch1.8.1-py3.8-linux-x86_64.egg/_k2.cpython-38-x86_64-linux-gnu.so(+0x106971) [0x7f2aa3858971]
/home/ddp-aban/soft/anaconda3/envs/k2/lib/python3.8/site-packages/k2-1.17.dev20220720+cuda11.1.torch1.8.1-py3.8-linux-x86_64.egg/_k2.cpython-38-x86_64-linux-gnu.so(+0x106f56) [0x7f2aa3858f56]
/home/ddp-aban/soft/anaconda3/envs/k2/lib/python3.8/site-packages/k2-1.17.dev20220720+cuda11.1.torch1.8.1-py3.8-linux-x86_64.egg/_k2.cpython-38-x86_64-linux-gnu.so(+0x14d248) [0x7f2aa389f248]
/home/ddp-aban/soft/anaconda3/envs/k2/lib/python3.8/site-packages/k2-1.17.dev20220720+cuda11.1.torch1.8.1-py3.8-linux-x86_64.egg/_k2.cpython-38-x86_64-linux-gnu.so(+0x139728) [0x7f2aa388b728]
/home/ddp-aban/soft/anaconda3/envs/k2/lib/python3.8/site-packages/k2-1.17.dev20220720+cuda11.1.torch1.8.1-py3.8-linux-x86_64.egg/_k2.cpython-38-x86_64-linux-gnu.so(+0x32a34) [0x7f2aa3784a34]
python(+0x13c00e) [0x55ff8cc8500e]
python(_PyObject_MakeTpCall+0x3bf) [0x55ff8cc7a13f]
python(+0x166ca0) [0x55ff8ccafca0]
python(_PyEval_EvalFrameDefault+0x4f83) [0x55ff8cd24923]
python(_PyEval_EvalCodeWithName+0x260) [0x55ff8cd15600]
python(_PyFunction_Vectorcall+0x594) [0x55ff8cd16bc4]
python(_PyEval_EvalFrameDefault+0x1510) [0x55ff8cd20eb0]
python(_PyEval_EvalCodeWithName+0x260) [0x55ff8cd15600]
python(_PyFunction_Vectorcall+0x534) [0x55ff8cd16b64]
python(_PyEval_EvalFrameDefault+0x4f83) [0x55ff8cd24923]
python(_PyFunction_Vectorcall+0x1b7) [0x55ff8cd167e7]
python(+0x166b2e) [0x55ff8ccafb2e]
python(_PyEval_EvalFrameDefault+0x71b) [0x55ff8cd200bb]
python(_PyFunction_Vectorcall+0x1b7) [0x55ff8cd167e7]
python(_PyEval_EvalFrameDefault+0x4c0) [0x55ff8cd1fe60]
python(_PyEval_EvalCodeWithName+0x260) [0x55ff8cd15600]
python(_PyFunction_Vectorcall+0x534) [0x55ff8cd16b64]
python(+0x166bf8) [0x55ff8ccafbf8]
python(PyObject_Call+0x7d) [0x55ff8cc8020d]
python(_PyEval_EvalFrameDefault+0x1f07) [0x55ff8cd218a7]
python(_PyEval_EvalCodeWithName+0x260) [0x55ff8cd15600]
python(_PyFunction_Vectorcall+0x594) [0x55ff8cd16bc4]
python(_PyObject_FastCallDict+0x5f) [0x55ff8cca762f]
python(+0x194d2b) [0x55ff8ccddd2b]
python(_PyObject_MakeTpCall+0x3bf) [0x55ff8cc7a13f]
python(_PyEval_EvalFrameDefault+0x4eff) [0x55ff8cd2489f]
python(_PyEval_EvalCodeWithName+0x260) [0x55ff8cd15600]
python(_PyFunction_Vectorcall+0x534) [0x55ff8cd16b64]
python(+0x166bf8) [0x55ff8ccafbf8]
python(PyObject_Call+0x7d) [0x55ff8cc8020d]
python(_PyEval_EvalFrameDefault+0x1f07) [0x55ff8cd218a7]
python(_PyEval_EvalCodeWithName+0x260) [0x55ff8cd15600]
python(_PyFunction_Vectorcall+0x594) [0x55ff8cd16bc4]
python(_PyObject_FastCallDict+0x5f) [0x55ff8cca762f]
python(+0x194d2b) [0x55ff8ccddd2b]
python(_PyObject_MakeTpCall+0x3bf) [0x55ff8cc7a13f]
python(_PyEval_EvalFrameDefault+0x4eff) [0x55ff8cd2489f]
python(_PyEval_EvalCodeWithName+0x260) [0x55ff8cd15600]
python(_PyFunction_Vectorcall+0x534) [0x55ff8cd16b64]
python(+0x166bf8) [0x55ff8ccafbf8]
E[F] /home/ddp-aban/soft/k2/k2/csrc/pinned_context.cu:313:virtual void k2::PinnedContext::CopyDataTo(size_t, const void*, k2::ContextPtr, void*) Check failed: ret == cudaSuccess (710 vs. 0) Error: device-side assert triggered.
[ Stack-Trace: ]
/home/ddp-aban/soft/anaconda3/envs/k2/lib/python3.8/site-packages/k2-1.17.dev20220720+cuda11.1.torch1.8.1-py3.8-linux-x86_64.egg/k2/lib64/libk2_log.so(k2::internal::GetStackTrace()+0x34) [0x7f2aa1646c34]
/home/ddp-aban/soft/anaconda3/envs/k2/lib/python3.8/site-packages/k2-1.17.dev20220720+cuda11.1.torch1.8.1-py3.8-linux-x86_64.egg/k2/lib64/libk2context.so(k2::PinnedContext::CopyDataTo(unsigned long, void const*, std::shared_ptr<k2::Context>, void*)+0xe5c) [0x7f2aa213b10c]
/home/ddp-aban/soft/anaconda3/envs/k2/lib/python3.8/site-packages/k2-1.17.dev20220720+cuda11.1.torch1.8.1-py3.8-linux-x86_64.egg/k2/lib64/libk2context.so(k2::PytorchCpuContext::CopyDataTo(unsigned long, void const*, std::shared_ptr<k2::Context>, void*)+0x14d) [0x7f2aa22c908d]
/home/ddp-aban/soft/anaconda3/envs/k2/lib/python3.8/site-packages/k2-1.17.dev20220720+cuda11.1.torch1.8.1-py3.8-linux-x86_64.egg/k2/lib64/libk2context.so(k2::Array1<int>::CopyFrom(k2::Array1<int> const&)+0x8c) [0x7f2aa1fb2b1c]
/home/ddp-aban/soft/anaconda3/envs/k2/lib/python3.8/site-packages/k2-1.17.dev20220720+cuda11.1.torch1.8.1-py3.8-linux-x86_64.egg/k2/lib64/libk2context.so(k2::RaggedShape::To(std::shared_ptr<k2::Context>, bool) const+0x5bd) [0x7f2aa2141c5d]
/home/ddp-aban/soft/anaconda3/envs/k2/lib/python3.8/site-packages/k2-1.17.dev20220720+cuda11.1.torch1.8.1-py3.8-linux-x86_64.egg/_k2.cpython-38-x86_64-linux-gnu.so(+0xd6f0e) [0x7f2aa3828f0e]
/home/ddp-aban/soft/anaconda3/envs/k2/lib/python3.8/site-packages/k2-1.17.dev20220720+cuda11.1.torch1.8.1-py3.8-linux-x86_64.egg/_k2.cpython-38-x86_64-linux-gnu.so(+0xd74cb) [0x7f2aa38294cb]
/home/ddp-aban/soft/anaconda3/envs/k2/lib/python3.8/site-packages/k2-1.17.dev20220720+cuda11.1.torch1.8.1-py3.8-linux-x86_64.egg/_k2.cpython-38-x86_64-linux-gnu.so(+0xce1fb) [0x7f2aa38201fb]
/home/ddp-aban/soft/anaconda3/envs/k2/lib/python3.8/site-packages/k2-1.17.dev20220720+cuda11.1.torch1.8.1-py3.8-linux-x86_64.egg/_k2.cpython-38-x86_64-linux-gnu.so(+0x32a34) [0x7f2aa3784a34]
python(+0x13c00e) [0x55ff8cc8500e]
python(_PyObject_MakeTpCall+0x3bf) [0x55ff8cc7a13f]
python(+0x166ca0) [0x55ff8ccafca0]
python(_PyEval_EvalFrameDefault+0x4f83) [0x55ff8cd24923]
python(_PyFunction_Vectorcall+0x1b7) [0x55ff8cd167e7]
python(+0x166b2e) [0x55ff8ccafb2e]
python(_PyEval_EvalFrameDefault+0x4f83) [0x55ff8cd24923]
python(_PyFunction_Vectorcall+0x1b7) [0x55ff8cd167e7]
python(+0x166b2e) [0x55ff8ccafb2e]
python(_PyEval_EvalFrameDefault+0x71b) [0x55ff8cd200bb]
python(_PyFunction_Vectorcall+0x1b7) [0x55ff8cd167e7]
python(_PyEval_EvalFrameDefault+0x4c0) [0x55ff8cd1fe60]
python(_PyEval_EvalCodeWithName+0x260) [0x55ff8cd15600]
python(_PyFunction_Vectorcall+0x534) [0x55ff8cd16b64]
python(+0x166bf8) [0x55ff8ccafbf8]
python(PyObject_Call+0x7d) [0x55ff8cc8020d]
python(_PyEval_EvalFrameDefault+0x1f07) [0x55ff8cd218a7]
python(_PyEval_EvalCodeWithName+0x260) [0x55ff8cd15600]
python(_PyFunction_Vectorcall+0x594) [0x55ff8cd16bc4]
python(_PyObject_FastCallDict+0x5f) [0x55ff8cca762f]
python(+0x194d2b) [0x55ff8ccddd2b]
python(_PyObject_MakeTpCall+0x3bf) [0x55ff8cc7a13f]
python(_PyEval_EvalFrameDefault+0x4eff) [0x55ff8cd2489f]
python(_PyEval_EvalCodeWithName+0x260) [0x55ff8cd15600]
python(_PyFunction_Vectorcall+0x534) [0x55ff8cd16b64]
python(+0x166bf8) [0x55ff8ccafbf8]
python(PyObject_Call+0x7d) [0x55ff8cc8020d]
python(_PyEval_EvalFrameDefault+0x1f07) [0x55ff8cd218a7]
python(_PyEval_EvalCodeWithName+0x260) [0x55ff8cd15600]
python(_PyFunction_Vectorcall+0x594) [0x55ff8cd16bc4]
python(_PyObject_FastCallDict+0x5f) [0x55ff8cca762f]
python(+0x194d2b) [0x55ff8ccddd2b]
python(_PyObject_MakeTpCall+0x3bf) [0x55ff8cc7a13f]
python(_PyEval_EvalFrameDefault+0x4eff) [0x55ff8cd2489f]
python(_PyEval_EvalCodeWithName+0x260) [0x55ff8cd15600]
python(_PyFunction_Vectorcall+0x534) [0x55ff8cd16b64]
python(+0x166bf8) [0x55ff8ccafbf8]
python(PyObject_Call+0x7d) [0x55ff8cc8020d]
python(_PyEval_EvalFrameDefault+0x1f07) [0x55ff8cd218a7]
python(_PyEval_EvalCodeWithName+0x260) [0x55ff8cd15600]
python(_PyFunction_Vectorcall+0x594) [0x55ff8cd16bc4]
E..
======================================================================
ERROR: test_autograd_remove_epsilon_and_add_self_loops (__main__.TestRemoveEpsilonDevice)
----------------------------------------------------------------------
Traceback (most recent call last):
File "./remove_epsilon_test.py", line 277, in test_autograd_remove_epsilon_and_add_self_loops
dest = k2.remove_epsilon_and_add_self_loops(src)
File "/home/ddp-aban/soft/anaconda3/envs/k2/lib/python3.8/site-packages/k2-1.17.dev20220720+cuda11.1.torch1.8.1-py3.8-linux-x86_64.egg/k2/fsa_algo.py", line 647, in remove_epsilon_and_add_self_loops
out_fsa = k2.utils.fsa_from_unary_function_ragged(
File "/home/ddp-aban/soft/anaconda3/envs/k2/lib/python3.8/site-packages/k2-1.17.dev20220720+cuda11.1.torch1.8.1-py3.8-linux-x86_64.egg/k2/utils.py", line 521, in fsa_from_unary_function_ragged
setattr(dest, name, new_value.remove_values_eq(filler))
RuntimeError:
Some bad things happened. Please read the above error messages and stack
trace. If you are using Python, the following command may be helpful:
gdb --args python /path/to/your/code.py
(You can use `gdb` to debug the code. Please consider compiling
a debug version of k2.).
If you are unable to fix it, please open an issue at:
https://github.com/k2-fsa/k2/issues/new
======================================================================
ERROR: test1 (__main__.TestRemoveEpsilonDeviceFillers)
----------------------------------------------------------------------
Traceback (most recent call last):
File "./remove_epsilon_test.py", line 342, in test1
fsa = k2.Fsa.from_str(s, aux_label_names=['foo']).to(device)
File "/home/ddp-aban/soft/anaconda3/envs/k2/lib/python3.8/site-packages/k2-1.17.dev20220720+cuda11.1.torch1.8.1-py3.8-linux-x86_64.egg/k2/fsa.py", line 1097, in to
ans = Fsa(self.arcs.to(device), properties=self.properties)
RuntimeError:
Some bad things happened. Please read the above error messages and stack
trace. If you are using Python, the following command may be helpful:
gdb --args python /path/to/your/code.py
(You can use `gdb` to debug the code. Please consider compiling
a debug version of k2.).
If you are unable to fix it, please open an issue at:
https://github.com/k2-fsa/k2/issues/new
======================================================================
FAIL: test_autograd (__main__.TestRemoveEpsilonDevice)
----------------------------------------------------------------------
Traceback (most recent call last):
File "./remove_epsilon_test.py", line 210, in test_autograd
assert dest.int_attr == expected_int_attr
AssertionError
----------------------------------------------------------------------
Ran 6 tests in 3.129s
FAILED (failures=1, errors=2)
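CUDA kernels are launched asynchronously, so a device-side assert like the one in the log above may be reported at a later CUDA call than the kernel that actually failed. A common first debugging step is to rerun the failing script with the environment variable CUDA_LAUNCH_BLOCKING=1, which makes kernel launches synchronous. The following is a minimal sketch of that rerun; the helper name `blocking_env` is hypothetical, and the script path is the one from the log.

```python
import os
import subprocess
import sys


def blocking_env(base_env=None):
    """Return a copy of the environment with synchronous CUDA launches enabled."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


if __name__ == "__main__":
    # Rerun the failing test; the script path is taken from the log above.
    subprocess.run(
        [sys.executable, "./remove_epsilon_test.py"],
        env=blocking_env(),
        check=False,
    )
```

This does not fix anything by itself; it only makes the reported location of the error more trustworthy.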
Someone reported the same error some time ago with CUDA 11.1 + torch 1.8.0.
But the error disappeared, without any code changes, just by switching to CUDA 10.2 + torch 1.10.0.
Cool. I'll install torch 1.12 and give it a try.
@csukuangfj torch 1.12 works well on CentOS 7. Many thanks.
After installing k2, Lhotse, and the icefall-related packages, testing yesno shows me the following errors. I know this is an old problem, as mentioned in #297, and it should have been fixed. However, the problem still exists.
The version of k2
PS: I installed k2 by compiling it from source.
The version of Lhotse
My OS is CentOS 7.
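Since the failure seems to depend on the torch/CUDA combination, it helps to state the installed package versions in a report like this one. Below is a minimal sketch that uses only the standard library (`importlib.metadata`, Python 3.8+); the function name `collect_versions` and the distribution names passed to it are assumptions, and a package built from source may be registered under a different name or not be found at all.

```python
from importlib import metadata


def collect_versions(packages):
    """Map each distribution name to its installed version, or None if not found."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Distribution names are assumptions; adjust them to match your install.
    for name, version in collect_versions(["torch", "k2", "lhotse"]).items():
        print("{}: {}".format(name, version or "not found"))
```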