TomHeaven / tensorflow-osx-build

Off-the-shelf python package of tensorflow with CUDA support for Mac OS.

Illegal instruction: 4 when importing tensorflow #8

Closed hudarsono closed 5 years ago

hudarsono commented 5 years ago

Hi,

I tried the tensorflow-1.12.0-cp36-cp36m-macosx_10_12_x86_64.whl build with Python 3.6.5_1 on Mac OS 10.13.6, and it fails on import:

(tfgpu) MacBook-Pro:tfgpu hud$ python
Python 3.6.5 (default, Jun 17 2018, 12:13:06) 
[GCC 4.2.1 Compatible Apple LLVM 9.1.0 (clang-902.0.39.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
Illegal instruction: 4

Any idea?

hudarsono commented 5 years ago

Update: I have tried all 3 builds of tensorflow 1.12.0 for CUDA 10 with 3 Python versions, and all of them result in "Illegal instruction: 4".
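
For anyone reproducing this: "Illegal instruction: 4" is how the shell reports a process killed by SIGILL (signal 4). One way to confirm that the import itself triggers the signal, without losing the interactive session, is to run the import in a child process. The following is only a minimal sketch, assuming Python 3.7 or later; `run_probe` and `probe_import` are hypothetical helper names, not part of TensorFlow or this repo.

```python
import signal
import subprocess
import sys


def run_probe(code, timeout=120):
    """Run `code` in a fresh interpreter and describe how it ended."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if proc.returncode < 0:
        # A negative return code means the child was killed by that signal.
        return {"ok": False, "signal": signal.Signals(-proc.returncode).name}
    return {"ok": proc.returncode == 0, "signal": None}


def probe_import(module):
    """Try to import `module` in a child process (hypothetical helper)."""
    return run_probe(f"import {module}")
```

Calling `probe_import("tensorflow")` on an affected machine would be expected to report `"signal": "SIGILL"`, while a plain import failure (for example, a missing package) shows up as `"ok": False` with no signal.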

TomHeaven commented 5 years ago

Try the latest build, tf 1.13.1. You may have hit a known bug reported here: https://github.com/tensorflow/tensorflow/issues/25822. However, py 2.7 and py 3.6 work fine on my Mac.

TomHeaven commented 5 years ago

@hudarsono By the way, what's your hardware configuration? Do you have an Nvidia GPU?

hudarsono commented 5 years ago

I am on a 15-inch Retina MacBook Pro (mid-2012) with an NVIDIA 650M.

I can't use tf 1.13.1 because, according to your release notes, it doesn't support compute capability 3.0, which is what my GPU has.

By the way, do you think GPU support for my card would make enough of a performance difference to be worth pursuing?

TomHeaven commented 5 years ago

Maybe you can try an older release: https://github.com/TomHeaven/tensorflow-osx-build/releases/tag/v1.10.0_cu90 or https://github.com/TomHeaven/tensorflow-osx-build/releases/tag/v1.10.0_cu90_py37 with CUDA 9.

Your display card is too old and I'm afraid it can only run small models.

hudarsono commented 5 years ago

The problem is that I couldn't get CUDA 9 to work on High Sierra 10.13.6; only CUDA 10 works. I think the CUDA runtime version that supports High Sierra requires CUDA 10.

TomHeaven commented 5 years ago

@hudarsono I will add compute capability 3.0 back to the supported list and update the release.

hudarsono commented 5 years ago

Many thanks, I look forward to trying it.

TomHeaven commented 5 years ago

Try this release: https://github.com/TomHeaven/tensorflow-osx-build/releases/tag/v1.13.1_cu100_full.

hudarsono commented 5 years ago

Strange, it still results in "Illegal instruction: 4". I noticed that when running python, it shows:

(tf) MacBook-Pro:tf hud$ python
Python 3.7.3 (default, Mar 27 2019, 09:23:39)
[Clang 10.0.0 (clang-1000.11.45.5)] on darwin
>>> import tensorflow
Illegal instruction: 4
(tf) MacBook-Pro:tf hud$

It's showing Clang 10.0.0, which comes from the previous command line tools. I have since downgraded, and clang --version already shows the correct version: Apple LLVM version 8.1.0 (clang-802.0.42). However, Python still shows Clang 10.0.0. Could that be the reason?

One more piece of information: this MacBook only supports AVX1.0, not AVX2. Does the build require AVX2?

This is the error dialog:

Process: Python [618]
Path: /usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/Resources/Python.app/Contents/MacOS/Python
Identifier: Python
Version: 3.7.3 (3.7.3)
Code Type: X86-64 (Native)
Parent Process: bash [485]
Responsible: Python [618]
User ID: 501

Date/Time: 2019-04-25 21:22:26.520 +0700
OS Version: Mac OS X 10.13.6 (17G6030)
Report Version: 12
Anonymous UUID: DDF3F55A-5B46-82B0-51A7-516EEFCD893D

Time Awake Since Boot: 82 seconds

System Integrity Protection: disabled

Crashed Thread: 0 Dispatch queue: com.apple.main-thread

Exception Type: EXC_BAD_INSTRUCTION (SIGILL)
Exception Codes: 0x0000000000000001, 0x0000000000000000
Exception Note: EXC_CORPSE_NOTIFY

Termination Signal: Illegal instruction: 4
Termination Reason: Namespace SIGNAL, Code 0x4
Terminating Process: exc handler [0]
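
The AVX question can be checked directly on an Intel Mac: `sysctl -n machdep.cpu.features` lists the base CPU features (including `AVX1.0` where available), and `sysctl -n machdep.cpu.leaf7_features` lists the extended ones (including `AVX2` where available). The sketch below parses those two strings; the helper names are hypothetical, and it assumes the feature tokens are spelled as they are on Intel macOS. It cannot tell which instructions a particular TensorFlow wheel was compiled with.

```python
import platform
import subprocess


def parse_features(text):
    """Split a sysctl feature string into a set of upper-cased tokens."""
    return {token.upper() for token in text.split()}


def avx_support(features, leaf7_features):
    """Report AVX/AVX2 availability from the two sysctl feature strings."""
    base = parse_features(features)
    leaf7 = parse_features(leaf7_features)
    return {"avx": "AVX1.0" in base, "avx2": "AVX2" in leaf7}


def read_sysctl(name):
    """Return the value of a sysctl key, or an empty string if it is missing."""
    result = subprocess.run(
        ["sysctl", "-n", name], capture_output=True, text=True, check=False
    )
    return result.stdout.strip()


def detect_avx():
    """Query the local machine; returns None when not running on macOS."""
    if platform.system() != "Darwin":
        return None
    return avx_support(
        read_sysctl("machdep.cpu.features"),
        read_sysctl("machdep.cpu.leaf7_features"),
    )
```

On a machine like the one described here, `detect_avx()` would be expected to report AVX as present and AVX2 as absent, which is consistent with a SIGILL from a binary built with AVX2 enabled.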

TomHeaven commented 5 years ago

I think AVX1.0 may be the cause, since your older CPU may not support the newer instructions. You can build your own tensorflow using the tutorial provided in the master branch.

Use --config=nonccl as follows:

bazel build --config=opt --config=nonccl //tensorflow/tools/pip_package:build_pip_package

Then you can skip the NCCL part of the tutorial.

hudarsono commented 5 years ago

Ok, thanks for the help so far.

hudarsono commented 5 years ago

Update: I have successfully built a working tensorflow GPU v1.10. By the way, could you release a patch for v1.13.1? Is it the same as for v1.10?

montaguegabe commented 5 years ago

I just successfully built with tensorflow v1.13.1 using the instructions included in this repo. Tricky things along the way:

1) I had to build without NCCL, and I learned this is only possible if TF_NCCL_VERSION and NCCL_INSTALL_PATH are not set and you build with --config=nonccl.
2) I had to integrate step 4 of https://medium.com/@mattias.arro/installing-tensorflow-1-2-from-sources-with-gpu-support-on-macos-4f2c5cab8186 regarding OpenMP.
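
The NCCL point above can be sketched as a small helper that drops the two variables from the environment and assembles the bazel command quoted earlier in this thread. This is only an illustration: `nccl_free_build` is a hypothetical name, and values already recorded by an earlier `./configure` run may need to be cleared separately.

```python
import os

# Variables that, per the comment above, must not be set for a no-NCCL build.
NCCL_VARS = ("TF_NCCL_VERSION", "NCCL_INSTALL_PATH")
BAZEL_TARGET = "//tensorflow/tools/pip_package:build_pip_package"


def nccl_free_build(env=None):
    """Return (command, environment) for a build with NCCL disabled."""
    source = os.environ if env is None else env
    clean_env = {key: value for key, value in source.items() if key not in NCCL_VARS}
    command = ["bazel", "build", "--config=opt", "--config=nonccl", BAZEL_TARGET]
    return command, clean_env
```

From the TensorFlow source root, the result could then be passed to `subprocess.run(command, env=clean_env, check=True)`.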

TomHeaven commented 5 years ago

Great! I should update the tutorial, but I just do not have the spare time. I'm closing this issue.

ShaunHolt commented 4 years ago

In case anyone is using Conda and running into this issue: if you run conda search tensorflow, you will find a list of tensorflow builds you can use. No GPU, no AVX? No problem. Uninstall the tensorflow you have installed, then run conda install tensorflow-mkl. It has worked for me with tensorflow 2.1.0, and now I can use Rasa/Tensorflow. Hopefully this helps others too.