Closed OUC-lan closed 3 years ago
@OUC-lan, can you run CriterionTest without that CTCEmptyTarget test to see if everything else passes? That's given us problems in the past.
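For reference, GoogleTest binaries accept a `--gtest_filter` flag, so one way to skip a single test without editing the source might look like the sketch below. It assumes the binary sits at src/tests/CriterionTest relative to the build directory (the path used in the session further down) and uses a wildcard so the exact suite prefix of CTCEmptyTarget does not need to be guessed.

```shell
# Path of the test binary relative to the build directory (assumption,
# taken from the session shown in this thread; override via TEST_BIN).
TEST_BIN="${TEST_BIN:-src/tests/CriterionTest}"

# A leading '-' in --gtest_filter excludes every test matching the pattern;
# all other tests still run.
FILTER='-*CTCEmptyTarget*'

if [ -x "$TEST_BIN" ]; then
    "$TEST_BIN" --gtest_filter="$FILTER"
else
    echo "test binary not found: $TEST_BIN" >&2
fi
```

Run it from the build directory, or set TEST_BIN to the full path of the binary.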
I rebuilt wav2letter and ran CriterionTest without the CTCEmptyTarget test. The results for wav2letter++:
root@0c7246f8bb8b:~/wav2letter/build# make test
Running tests...
Test project /root/wav2letter/build
Start 1: W2lCommonTest
1/20 Test #1: W2lCommonTest .................... Passed 3.37 sec
Start 2: DictionaryTest
2/20 Test #2: DictionaryTest ................... Passed 0.03 sec
Start 3: CriterionTest
3/20 Test #3: CriterionTest ....................***Failed 0.98 sec
Start 4: Seq2SeqTest
4/20 Test #4: Seq2SeqTest ...................... Passed 6.66 sec
Start 5: AttentionTest
5/20 Test #5: AttentionTest .................... Passed 3.05 sec
Start 6: WindowTest
6/20 Test #6: WindowTest ....................... Passed 2.10 sec
Start 7: DataTest
7/20 Test #7: DataTest ......................... Passed 0.93 sec
Start 8: SoundTest
8/20 Test #8: SoundTest ........................ Passed 0.05 sec
Start 9: DecoderTest
9/20 Test #9: DecoderTest ...................... Passed 0.81 sec
Start 10: CeplifterTest
10/20 Test #10: CeplifterTest .................... Passed 0.03 sec
Start 11: DctTest
11/20 Test #11: DctTest .......................... Passed 0.05 sec
Start 12: DerivativesTest
12/20 Test #12: DerivativesTest .................. Passed 0.02 sec
Start 13: DitherTest
13/20 Test #13: DitherTest ....................... Passed 8.04 sec
Start 14: MfccTest
14/20 Test #14: MfccTest ......................... Passed 0.20 sec
Start 15: PreEmphasisTest
15/20 Test #15: PreEmphasisTest .................. Passed 0.03 sec
Start 16: SpeechUtilsTest
16/20 Test #16: SpeechUtilsTest .................. Passed 0.94 sec
Start 17: TriFilterbankTest
17/20 Test #17: TriFilterbankTest ................ Passed 0.02 sec
Start 18: WindowingTest
18/20 Test #18: WindowingTest .................... Passed 0.02 sec
Start 19: W2lModuleTest
19/20 Test #19: W2lModuleTest .................... Passed 2.89 sec
Start 20: RuntimeTest
20/20 Test #20: RuntimeTest ...................... Passed 1.90 sec
95% tests passed, 1 tests failed out of 20
Total Test time (real) = 32.12 sec
The following tests FAILED:
3 - CriterionTest (Failed)
Errors while running CTest
Makefile:104: recipe for target 'test' failed
make: *** [test] Error 8
root@0c7246f8bb8b:~/wav2letter/build# src/tests/CriterionTest
[==========] Running 16 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 16 tests from CriterionTest
[ RUN ] CriterionTest.CTCCost
/root/wav2letter/src/criterion/test/CriterionTest.cpp:93: Failure
The difference between loss1.scalar<float>() and 0.0 is nan, which exceeds kEpsilon, where
loss1.scalar<float>() evaluates to nan,
0.0 evaluates to 0, and
kEpsilon evaluates to 9.9999997473787516e-06.
[ FAILED ] CriterionTest.CTCCost (619 ms)
[ RUN ] CriterionTest.CTCJacobian
unknown file: Failure
C++ exception with description "Error: compute_ctc_loss, stat = execution failed" thrown in the test body.
[ FAILED ] CriterionTest.CTCJacobian (194 ms)
[ RUN ] CriterionTest.Batching
unknown file: Failure
C++ exception with description "ArrayFire Exception (Internal error:998):
In function void* cuda::MemoryManager::nativeAlloc(size_t)
In file src/backend/cuda/memory.cpp:149
CUDA Error (77): an illegal memory access was encountered
In function af::array af::randu(const af::dim4&, af::dtype)
In file src/api/cpp/random.cpp:78" thrown in the test body.
[ FAILED ] CriterionTest.Batching (0 ms)
[ RUN ] CriterionTest.CTCCompareTensorflow
unknown file: Failure
C++ exception with description "ArrayFire Exception (Internal error:998):
In function cuda::Array<T>::Array(af::dim4, const T*, bool, bool) [with T = float]
In file src/backend/cuda/Array.cpp:74
CUDA Error (77): an illegal memory access was encountered
In function void {anonymous}::initDataArray(void**, const void*, af::dtype, af::source, dim_t, dim_t, dim_t, dim_t)
In file src/api/cpp/array.cpp:103" thrown in the test body.
[ FAILED ] CriterionTest.CTCCompareTensorflow (0 ms)
[ RUN ] CriterionTest.ViterbiPath
unknown file: Failure
C++ exception with description "ArrayFire Exception (Internal error:998):
In function void cuda::evalNodes(std::vector<cuda::Param<T> >&, std::vector<common::Node*>) [with T = float]
In file src/backend/cuda/jit.cpp:329
CU Error CUDA_ERROR_ILLEGAL_ADDRESS(700): an illegal memory access was encountered
In function af::array::array_proxy& af::array::array_proxy::operator=(const af::array&)
In file src/api/cpp/array.cpp:470" thrown in the test body.
[ FAILED ] CriterionTest.ViterbiPath (0 ms)
[ RUN ] CriterionTest.FCCCost
unknown file: Failure
C++ exception with description "ArrayFire Exception (Internal error:998):
In function cuda::Array<T>::Array(af::dim4, const T*, bool, bool) [with T = float]
In file src/backend/cuda/Array.cpp:74
CUDA Error (77): an illegal memory access was encountered
In function void {anonymous}::initDataArray(void**, const void*, af::dtype, af::source, dim_t, dim_t, dim_t, dim_t)
In file src/api/cpp/array.cpp:103" thrown in the test body.
[ FAILED ] CriterionTest.FCCCost (0 ms)
[ RUN ] CriterionTest.FCCJacobian
unknown file: Failure
C++ exception with description "ArrayFire Exception (Internal error:998):
In function void cuda::evalNodes(std::vector<cuda::Param<T> >&, std::vector<common::Node*>) [with T = int]
In file src/backend/cuda/jit.cpp:329
CU Error CUDA_ERROR_ILLEGAL_ADDRESS(700): an illegal memory access was encountered
In function T* af::array::device() const [with T = void]
In file src/api/cpp/array.cpp:941" thrown in the test body.
[ FAILED ] CriterionTest.FCCJacobian (0 ms)
[ RUN ] CriterionTest.FACCost
unknown file: Failure
C++ exception with description "ArrayFire Exception (Internal error:998):
In function cuda::Array<T>::Array(af::dim4, const T*, bool, bool) [with T = float]
In file src/backend/cuda/Array.cpp:74
CUDA Error (77): an illegal memory access was encountered
In function void {anonymous}::initDataArray(void**, const void*, af::dtype, af::source, dim_t, dim_t, dim_t, dim_t)
In file src/api/cpp/array.cpp:103" thrown in the test body.
[ FAILED ] CriterionTest.FACCost (0 ms)
[ RUN ] CriterionTest.FACJacobian
unknown file: Failure
C++ exception with description "ArrayFire Exception (Internal error:998):
In function cuda::Array<T>::Array(af::dim4, const T*, bool, bool) [with T = int]
In file src/backend/cuda/Array.cpp:74
CUDA Error (77): an illegal memory access was encountered
In function void {anonymous}::initDataArray(void**, const void*, af::dtype, af::source, dim_t, dim_t, dim_t, dim_t)
In file src/api/cpp/array.cpp:103" thrown in the test body.
[ FAILED ] CriterionTest.FACJacobian (1 ms)
[ RUN ] CriterionTest.ASGCost
unknown file: Failure
C++ exception with description "ArrayFire Exception (Internal error:998):
In function void cuda::kernel::identity(cuda::Param<T>) [with T = float]
In file src/backend/cuda/kernel/identity.hpp:58
CUDA Error (77): an illegal memory access was encountered
In function af::array af::identity(const af::dim4&, af::dtype)
In file src/api/cpp/data.cpp:152" thrown in the test body.
[ FAILED ] CriterionTest.ASGCost (0 ms)
[ RUN ] CriterionTest.ASGJacobian
unknown file: Failure
C++ exception with description "ArrayFire Exception (Internal error:998):
In function cuda::Array<T>::Array(af::dim4, const T*, bool, bool) [with T = int]
In file src/backend/cuda/Array.cpp:74
CUDA Error (77): an illegal memory access was encountered
In function void {anonymous}::initDataArray(void**, const void*, af::dtype, af::source, dim_t, dim_t, dim_t, dim_t)
In file src/api/cpp/array.cpp:103" thrown in the test body.
[ FAILED ] CriterionTest.ASGJacobian (0 ms)
[ RUN ] CriterionTest.LinSegJacobian
unknown file: Failure
C++ exception with description "ArrayFire Exception (Internal error:998):
In function cuda::Array<T>::Array(af::dim4, const T*, bool, bool) [with T = int]
In file src/backend/cuda/Array.cpp:74
CUDA Error (77): an illegal memory access was encountered
In function void {anonymous}::initDataArray(void**, const void*, af::dtype, af::source, dim_t, dim_t, dim_t, dim_t)
In file src/api/cpp/array.cpp:103" thrown in the test body.
[ FAILED ] CriterionTest.LinSegJacobian (0 ms)
[ RUN ] CriterionTest.ASGBatching
unknown file: Failure
C++ exception with description "ArrayFire Exception (Internal error:998):
In function void* cuda::MemoryManager::nativeAlloc(size_t)
In file src/backend/cuda/memory.cpp:149
CUDA Error (77): an illegal memory access was encountered
In function af::array af::randu(const af::dim4&, af::dtype)
In file src/api/cpp/random.cpp:78" thrown in the test body.
[ FAILED ] CriterionTest.ASGBatching (0 ms)
[ RUN ] CriterionTest.ASGCompareLua
unknown file: Failure
C++ exception with description "ArrayFire Exception (Internal error:998):
In function void cuda::kernel::identity(cuda::Param<T>) [with T = float]
In file src/backend/cuda/kernel/identity.hpp:58
CUDA Error (77): an illegal memory access was encountered
In function af::array af::identity(const af::dim4&, af::dtype)
In file src/api/cpp/data.cpp:152" thrown in the test body.
[ FAILED ] CriterionTest.ASGCompareLua (0 ms)
[ RUN ] CriterionTest.LinSegCompareLua
unknown file: Failure
C++ exception with description "ArrayFire Exception (Internal error:998):
In function void cuda::kernel::identity(cuda::Param<T>) [with T = float]
In file src/backend/cuda/kernel/identity.hpp:58
CUDA Error (77): an illegal memory access was encountered
In function af::array af::identity(const af::dim4&, af::dtype)
In file src/api/cpp/data.cpp:152" thrown in the test body.
[ FAILED ] CriterionTest.LinSegCompareLua (0 ms)
[ RUN ] CriterionTest.AsgSerialization
unknown file: Failure
C++ exception with description "ArrayFire Exception (Internal error:998):
In function void* cuda::MemoryManager::nativeAlloc(size_t)
In file src/backend/cuda/memory.cpp:149
CUDA Error (77): an illegal memory access was encountered
In function af::array af::identity(const af::dim4&, af::dtype)
In file src/api/cpp/data.cpp:152" thrown in the test body.
[ FAILED ] CriterionTest.AsgSerialization (0 ms)
[----------] 16 tests from CriterionTest (814 ms total)
[----------] Global test environment tear-down
[==========] 16 tests from 1 test case ran. (814 ms total)
[ PASSED ] 0 tests.
[ FAILED ] 16 tests, listed below:
[ FAILED ] CriterionTest.CTCCost
[ FAILED ] CriterionTest.CTCJacobian
[ FAILED ] CriterionTest.Batching
[ FAILED ] CriterionTest.CTCCompareTensorflow
[ FAILED ] CriterionTest.ViterbiPath
[ FAILED ] CriterionTest.FCCCost
[ FAILED ] CriterionTest.FCCJacobian
[ FAILED ] CriterionTest.FACCost
[ FAILED ] CriterionTest.FACJacobian
[ FAILED ] CriterionTest.ASGCost
[ FAILED ] CriterionTest.ASGJacobian
[ FAILED ] CriterionTest.LinSegJacobian
[ FAILED ] CriterionTest.ASGBatching
[ FAILED ] CriterionTest.ASGCompareLua
[ FAILED ] CriterionTest.LinSegCompareLua
[ FAILED ] CriterionTest.AsgSerialization
16 FAILED TESTS
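One thing that may be worth checking: every failure from CriterionTest.Batching onward reports the same CUDA illegal memory access error. Once a process has hit an illegal access, later CUDA calls in that process generally fail with the same error, so the later failures might be a cascade from the first CTC failures rather than independent problems. A rough sketch for running each test in its own process follows. The `list_tests` helper name is made up for illustration, the binary path is taken from the session above, and `--gtest_list_tests` and `--gtest_filter` are standard GoogleTest flags.

```shell
# Path of the test binary relative to the build directory (assumption).
TEST_BIN="${TEST_BIN:-src/tests/CriterionTest}"

# Hypothetical helper: convert `--gtest_list_tests` output into fully
# qualified names. Suite lines start in column 0 and end with '.', test
# lines are indented beneath them.
list_tests() {
    awk '/^[^ ]/ { suite = $1 } /^ / { print suite $1 }'
}

if [ -x "$TEST_BIN" ]; then
    # Launch a fresh process per test so a corrupted CUDA context from one
    # test cannot affect the next.
    "$TEST_BIN" --gtest_list_tests | list_tests | while read -r name; do
        if "$TEST_BIN" --gtest_filter="$name" >/dev/null 2>&1; then
            echo "PASS $name"
        else
            echo "FAIL $name"
        fi
    done
fi
```

Tests that still fail when run alone are the ones to investigate first.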
@OUC-lan just to eliminate some things, can you try building and running the warpctc tests independently of wav2letter and check that everything works?
Can you also confirm your CUDA version, CUDA driver version, and GPU model/type?
cc @jcai1
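In case it helps with collecting those details, here is a minimal sketch that prints the GPU model, driver version, and CUDA toolkit version. It assumes `nvidia-smi` and `nvcc` are available on PATH inside the container, which may not hold for every image; the `report_cuda_env` function name is only illustrative.

```shell
# Illustrative helper: print whatever GPU/CUDA details the environment exposes.
report_cuda_env() {
    # GPU model and driver version, if the NVIDIA driver utility is present.
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi --query-gpu=name,driver_version --format=csv \
            || echo "nvidia-smi failed"
    else
        echo "nvidia-smi not found"
    fi

    # CUDA toolkit (compiler) version, if nvcc is on PATH.
    if command -v nvcc >/dev/null 2>&1; then
        nvcc --version || echo "nvcc failed"
    else
        echo "nvcc not found"
    fi
}

report_cuda_env
```

Pasting the output into the thread should cover all three items.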
Closing due to inactivity and the age of this issue.
I built wav2letter with Docker.
When trying to run Train, I get this error:
Then I ran make test for both wav2letter++ and flashlight.
The results for wav2letter++:
The results for flashlight:
Any help please?