flashlight / wav2letter

Facebook AI Research's Automatic Speech Recognition Toolkit
https://github.com/facebookresearch/wav2letter/wiki

Problem with mkl dnn on an older CPU #949

Open abrigante-dev opened 3 years ago

abrigante-dev commented 3 years ago

I cannot train or decode without getting an mkldnn error, as shown below. I am using the pre-built CPU backend docker image. W2l runs perfectly fine on a different computer. I believe the problem is that the system has a 3rd generation (Ivy Bridge) processor. I have included the error from when trying to decode and the output of "make test". Any support on how to get w2l up and running on this system would be greatly appreciated.
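An illegal-instruction crash like the one reported below usually means the binary uses CPU instructions the processor does not implement. Ivy Bridge supports AVX but not AVX2 or FMA, so a quick first check is to see which extensions the CPU reports. The following is a minimal sketch, assuming a Linux host that exposes `/proc/cpuinfo`; the helper names `parse_cpu_flags` and `report` are hypothetical, and the list of flags is an illustrative guess, not a confirmed requirement of the prebuilt mkldnn library.

```python
from pathlib import Path

# Extensions worth checking. Which of these the prebuilt mkldnn binary
# actually needs is an assumption; the logs in this issue do not say.
FLAGS_OF_INTEREST = ("sse4_2", "avx", "avx2", "fma", "avx512f")


def parse_cpu_flags(cpuinfo_text):
    """Return the set of feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            # Every core lists the same flags, so the first entry is enough.
            return set(value.split())
    return set()


def report(flags):
    """Map each flag of interest to whether the CPU reports it."""
    return {name: name in flags for name in FLAGS_OF_INTEREST}


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    if path.exists():
        for name, present in report(parse_cpu_flags(path.read_text())).items():
            print(f"{name:8s} {'yes' if present else 'MISSING'}")
    else:
        print("/proc/cpuinfo not found (not a Linux host?)")
```

Running this on both the working machine and the failing one, and comparing the two reports, would show whether the machines differ in any of these extensions. It does not prove which instruction triggered the crash.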

Example run of the Decoder showing the mkldnn error:
root@fd214a1661ea:~/wav2letter/build# ./Decoder --flagsfile=./decode_run.cfg
Loading the LM will be faster if you build a binary file.
Reading ./lm/hs/harvard_sentences.arpa
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100


Aborted at 1613068684 (unix time) try "date -d @1613068684" if you are using GNU date
PC: @ 0x7f34cecf66a9 mkldnn::impl::get_msec()
SIGILL (@0x7f34cecf66a9) received by PID 33 (TID 0x7f34a7fff700) from PID 18446744072884283049; stack trace:
    @ 0x7f34c84d5890 (unknown)
    @ 0x7f34cecf66a9 mkldnn::impl::get_msec()
    @ 0x7f34ced567ff mkldnn::impl::cpu::gemm_convolution_fwd_t::pd_t::create_primitive()
    @ 0x55c856b173a3 (unknown)
    @ 0x55c856abf021 (unknown)
    @ 0x55c856ad57be (unknown)
    @ 0x55c856abcf1a (unknown)
    @ 0x55c85685963f (unknown)
    @ 0x55c85685a023 (unknown)
    @ 0x55c856864a69 (unknown)
    @ 0x7f34c84d2827 __pthread_once_slow
    @ 0x55c85685178d (unknown)
    @ 0x55c856864c70 (unknown)
    @ 0x7f34c7fc66df (unknown)
    @ 0x7f34c84ca6db start_thread
    @ 0x7f34c768388f clone
Illegal instruction

make test run:
root@fd214a1661ea:~/wav2letter/build# make test
Running tests...
Test project /root/wav2letter/build
Start 1: W2lCommonTest
1/22 Test #1: W2lCommonTest .................... Passed 2.22 sec
Start 2: DictionaryTest
2/22 Test #2: DictionaryTest ................... Passed 0.21 sec
Start 3: ProducerConsumerQueueTest
3/22 Test #3: ProducerConsumerQueueTest ........ Passed 0.16 sec
Start 4: CriterionTest
4/22 Test #4: CriterionTest ....................Failed 4.93 sec
Start 5: Seq2SeqTest
5/22 Test #5: Seq2SeqTest ......................Failed 0.27 sec
Start 6: AttentionTest
6/22 Test #6: AttentionTest ....................Exception: Illegal 0.20 sec
Start 7: WindowTest
7/22 Test #7: WindowTest ....................... Passed 0.21 sec
Start 8: DataTest
8/22 Test #8: DataTest ......................... Passed 0.70 sec
Start 9: ListFileDatasetTest
9/22 Test #9: ListFileDatasetTest .............. Passed 0.17 sec
Start 10: SoundTest
10/22 Test #10: SoundTest ........................ Passed 0.26 sec
Start 11: DecoderTest
11/22 Test #11: DecoderTest ...................... Passed 2.06 sec
Start 12: CeplifterTest
12/22 Test #12: CeplifterTest .................... Passed 0.20 sec
Start 13: DctTest
13/22 Test #13: DctTest .......................... Passed 0.17 sec
Start 14: DerivativesTest
14/22 Test #14: DerivativesTest .................. Passed 0.15 sec
Start 15: DitherTest
15/22 Test #15: DitherTest ....................... Passed 8.16 sec
Start 16: MfccTest
16/22 Test #16: MfccTest ......................... Passed 1.24 sec
Start 17: PreEmphasisTest
17/22 Test #17: PreEmphasisTest .................. Passed 0.19 sec
Start 18: SpeechUtilsTest
18/22 Test #18: SpeechUtilsTest .................. Passed 0.85 sec
Start 19: TriFilterbankTest
19/22 Test #19: TriFilterbankTest ................ Passed 0.17 sec
Start 20: WindowingTest
20/22 Test #20: WindowingTest .................... Passed 0.16 sec
Start 21: W2lModuleTest
21/22 Test #21: W2lModuleTest ....................Exception: Illegal 0.24 sec
Start 22: RuntimeTest
22/22 Test #22: RuntimeTest ......................***Exception: Illegal 0.17 sec

77% tests passed, 5 tests failed out of 22

Total Test time (real) = 23.26 sec

The following tests FAILED:
4 - CriterionTest (Failed)
5 - Seq2SeqTest (Failed)
6 - AttentionTest (ILLEGAL)
21 - W2lModuleTest (ILLEGAL)
22 - RuntimeTest (ILLEGAL)
Errors while running CTest
Makefile:72: recipe for target 'test' failed
make: *** [test] Error 8

tlikhomanenko commented 3 years ago

Could you first run separately one of these failed tests and post here the full error log?
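One way to run a single failed test on its own is CTest's `-R` name filter combined with `-V` for verbose output. The helper below is only a sketch: the function names are hypothetical, and it assumes `ctest` is on the PATH and that the build directory is the one shown in the logs above (`/root/wav2letter/build`).

```python
import subprocess


def ctest_command(test_name):
    """Build a ctest invocation that selects exactly one test by name."""
    # -R takes a regex; anchoring it avoids also matching longer test names.
    # -V prints the test's full output instead of only the pass/fail line.
    return ["ctest", "-R", f"^{test_name}$", "-V"]


def run_single_test(test_name, build_dir):
    """Run one test from build_dir and return (exit code, combined output)."""
    result = subprocess.run(
        ctest_command(test_name),
        cwd=build_dir,
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout + result.stderr
```

For example, `run_single_test("AttentionTest", "/root/wav2letter/build")` would return the exit code and the full log for that test, which can then be pasted into the issue.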

abrigante-dev commented 3 years ago

Could you first run separately one of these failed tests and post here the full error log?

@tlikhomanenko Here is the log for each of the failed tests. Thank you for the help!

6/22 Testing: AttentionTest
6/22 Test: AttentionTest
Command: "/root/wav2letter/build/src/tests/AttentionTest"
Directory: /root/wav2letter/build/src/tests
"AttentionTest" start time: Feb 17 17:20 UTC
Output:
[==========] Running 5 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 5 tests from AttentionTest
[ RUN ] AttentionTest.NeuralContentAttention
[ OK ] AttentionTest.NeuralContentAttention (13 ms)
[ RUN ] AttentionTest.SimpleLocationAttention

Test time = 0.03 sec
----------------------------------------------------------
Test Failed.
"AttentionTest" end time: Feb 17 17:20 UTC
"AttentionTest" time elapsed: 00:00:00
----------------------------------------------------------
--------------------------------------------------------------------------------------------------------------
21/22 Testing: W2lModuleTest
21/22 Test: W2lModuleTest
Command: "/root/wav2letter/build/src/tests/W2lModuleTest"
Directory: /root/wav2letter/build/src/tests
"W2lModuleTest" start time: Feb 17 17:20 UTC
Output:
----------------------------------------------------------
[==========] Running 2 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 2 tests from W2lModuleTest
[ RUN ] W2lModuleTest.W2lSeqModule
Test time = 0.10 sec
----------------------------------------------------------
Test Failed.
"W2lModuleTest" end time: Feb 17 17:20 UTC
"W2lModuleTest" time elapsed: 00:00:00
----------------------------------------------------------
--------------------------------------------------------------------------------------------------------------
22/22 Testing: RuntimeTest
22/22 Test: RuntimeTest
Command: "/root/wav2letter/build/src/tests/RuntimeTest"
Directory: /root/wav2letter/build/src/tests
"RuntimeTest" start time: Feb 17 17:20 UTC
Output:
----------------------------------------------------------
[==========] Running 3 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 3 tests from RuntimeTest
[ RUN ] RuntimeTest.LoadAndSave
Test time = 0.03 sec
----------------------------------------------------------
Test Failed.
"RuntimeTest" end time: Feb 17 17:20 UTC
"RuntimeTest" time elapsed: 00:00:00
----------------------------------------------------------

tlikhomanenko commented 3 years ago

Ahh, sorry. We previously had problems with some of the CPU tests and with the way we used mkldnn.

We recently switched to onednn and fixed/improved all CPU implementations and tests. Please use the latest flashlight (https://github.com/facebookresearch/flashlight), not the wav2letter v0.2 branch.

Can you try this latest flashlight master? Let me know if it works/doesn't work for you.