NervanaSystems / ngraph-tf

Bridge to connect nGraph with TensorFlow

Illegal instruction (core dumped) when trying to run with nGraph-TensorFlow bridge #446

Open wwwwcu opened 5 years ago

wwwwcu commented 5 years ago

Hi everyone,

I'm trying to run a TensorFlow example with the nGraph-TensorFlow bridge, but I receive the following error when sess.run() is called: Illegal instruction (core dumped). When I run the code without importing ngraph_bridge, it works perfectly. I found that someone has hit this problem before, but no system info was provided and the issue has been closed. So here is my system info:

OS platform: Ubuntu 16.04.4 LTS (GNU/Linux 4.4.0-116-generic x86_64)
CPU: Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Python: Python 3.5.2
GCC: GCC 5.4.0

I installed ngraph_bridge via Option 1: Use a pre-built nGraph-TensorFlow bridge. When I run the command:

python -c "import tensorflow as tf; print('TensorFlow version: ',tf.__version__);import ngraph_bridge; print(ngraph_bridge.__version__)"

the result is:

TensorFlow version: 1.12.0
nGraph bridge version: b'0.11.0'
nGraph version used for this build: b'0.14.0+56a54ca'
TensorFlow version used for this build: v1.12.0-0-ga6d8ffa

I also tried pinning the version to ngraph_bridge==0.8.0. The result is:

TensorFlow version: r 1.12.0
TensorFlow version installed: 1.12.0 (v1.12.0-0-ga6d8ffae09)
nGraph bridge built with: 1.12.0 (v1.12.0-0-ga6d8ffa)
b'0.8.0'

But the core dump occurs with both versions.

Thank you!

yunzhongyan0 commented 5 years ago

I have the same problem. Have you solved it?

SleepProgger commented 5 years ago

Same problem here using an AMD FX-4300. If I understand correctly, the pip version of ngraph-tf is compiled with CPU features that my CPU doesn't support (BMI2).

# gdb --args python3 keras_sample.py
(gdb) r
...
Thread 1 "python3" received signal SIGILL, Illegal instruction.
0x00007fffd393f86c in tensorflow::ngraph_bridge::(anonymous namespace)::DeadnessAnalysisImpl::Populate() () from /home/nope/venvs/ngraph-tf_pip_36/lib/python3.6/site-packages/ngraph_bridge/libngraph_bridge.so
(gdb) x/10i $pc
=> 0x7fffd393f86c <_ZN10tensorflow13ngraph_bridge12_GLOBAL__N_120DeadnessAnalysisImpl8PopulateEv+636>:  shlx   %ebp,%edx,%r12d
   0x7fffd393f871 <_ZN10tensorflow13ngraph_bridge12_GLOBAL__N_120DeadnessAnalysisImpl8PopulateEv+641>:  mov    %r12d,%eax
   0x7fffd393f874 <_ZN10tensorflow13ngraph_bridge12_GLOBAL__N_120DeadnessAnalysisImpl8PopulateEv+644>:  vcvtsi2sd %rax,%xmm0,%xmm0
   0x7fffd393f879 <_ZN10tensorflow13ngraph_bridge12_GLOBAL__N_120DeadnessAnalysisImpl8PopulateEv+649>:  vmulsd 0x13497(%rip),%xmm0,%xmm3        # 0x7fffd3952d18
   0x7fffd393f881 <_ZN10tensorflow13ngraph_bridge12_GLOBAL__N_120DeadnessAnalysisImpl8PopulateEv+657>:  vucomisd %xmm3,%xmm1
   0x7fffd393f885 <_ZN10tensorflow13ngraph_bridge12_GLOBAL__N_120DeadnessAnalysisImpl8PopulateEv+661>:  vmovsd %xmm3,0x8(%rsp)
   0x7fffd393f88b <_ZN10tensorflow13ngraph_bridge12_GLOBAL__N_120DeadnessAnalysisImpl8PopulateEv+667>:  jae    0x7fffd393f868 <_ZN10tensorflow13ngraph_bridge12_GLOBAL__N_120DeadnessAnalysisImpl8PopulateEv+632>
   0x7fffd393f88d <_ZN10tensorflow13ngraph_bridge12_GLOBAL__N_120DeadnessAnalysisImpl8PopulateEv+669>:  mov    $0x1,%eax
   0x7fffd393f892 <_ZN10tensorflow13ngraph_bridge12_GLOBAL__N_120DeadnessAnalysisImpl8PopulateEv+674>:  mov    $0xffffffffffffffff,%rdi
   0x7fffd393f899 <_ZN10tensorflow13ngraph_bridge12_GLOBAL__N_120DeadnessAnalysisImpl8PopulateEv+681>:  shlx   %ebp,%eax,%ecx

shlx seems to be part of BMI2.
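
For anyone who wants to check their own machine before installing the pip wheel, here is a rough sketch that reads the CPU feature flags from /proc/cpuinfo on Linux and reports whether bmi2 is listed. The helper names (parse_cpu_flags, missing_features) are made up for this example, and checking only bmi2 is an assumption based on the shlx instruction in the trace above; the wheel may well rely on other extensions too.

```python
import os


def parse_cpu_flags(cpuinfo_text):
    """Return the set of feature flags from /proc/cpuinfo-style text.

    Hypothetical helper: only looks at the first "flags" line, which is
    how x86 Linux kernels report CPU features.
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_features(flags, required=("bmi2",)):
    """Return the required feature names that are absent from flags.

    The default ("bmi2",) is an assumption taken from the SIGILL on shlx.
    """
    return [name for name in required if name not in flags]


if __name__ == "__main__":
    path = "/proc/cpuinfo"  # Linux only
    if os.path.exists(path):
        with open(path) as f:
            flags = parse_cpu_flags(f.read())
        missing = missing_features(flags)
        if missing:
            print("CPU does not report:", ", ".join(missing))
        else:
            print("CPU reports bmi2")
    else:
        print(path, "not found; this check only works on Linux")
```

If bmi2 is reported as missing, the prebuilt binary will probably hit the same SIGILL, and building the bridge from source on the target machine is the likely way around it.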