fo40225 / tensorflow-windows-wheel

Tensorflow prebuilt binary for Windows

[Question(s)] Can you build a C++ standalone shared-lib (no python dep)? #6

Open BraynStorm opened 6 years ago

BraynStorm commented 6 years ago

As the title says. Also, I see that you build with SSE too; does this mean you can build TF as a 32-bit binary?

fo40225 commented 6 years ago

I have not tried building for x86 yet; I can give it a try.

fo40225 commented 6 years ago

The build fails; it seems to require some code fixes.

"C:\Users\User\Source\Repos\tensorflow\tensorflow\contrib\cmake\build\tf_python_build_pip_package.vcxproj" (default target) (1) ->
"C:\Users\User\Source\Repos\tensorflow\tensorflow\contrib\cmake\build\pywrap_tensorflow_internal.vcxproj" (default target) (2) ->
"C:\Users\User\Source\Repos\tensorflow\tensorflow\contrib\cmake\build\pywrap_tensorflow_internal_static.vcxproj" (default target) (3) ->
"C:\Users\User\Source\Repos\tensorflow\tensorflow\contrib\cmake\build\tf_core_cpu.vcxproj" (default target) (126) ->
(ClCompile target) ->
  C:\Users\User\Source\Repos\tensorflow\tensorflow/core/common_runtime/bfc_allocator.h(383): error C3861: '_BitScanReverse64': identifier not found (compiling source file C:\Users\User\Source\Repos\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc) [C:\Users\User\Source\Repos\tensorflow\tensorflow\contrib\cmake\build\tf_core_cpu.vcxproj]

fo40225 commented 6 years ago

Because cc_test is broken, I could not run the full test suite.

I haven't used TensorFlow's C++ API before; this build worked with the sample when targeting a 32-bit exe.

Before using it, you should run a complete test yourself.

https://github.com/fo40225/tensorflow-windows-wheel/tree/master/1.7.0/cpp


Here is how to build the 32-bit C++ library.

env

Windows 10 x64
git 2.14.1 (in PATH)
cmake 3.9.6 (in PATH)
python (in PATH)
Visual Studio 2017

get source

cd %HOMEPATH%
git clone https://github.com/tensorflow/tensorflow.git -b v1.7.0
cd tensorflow/tensorflow/contrib/cmake
mkdir build
cd build

apply fix

Add add_definitions(-DEIGEN_DEFAULT_DENSE_INDEX_TYPE=std::int64_t) to tensorflow\tensorflow\contrib\cmake\CMakeLists.txt


Edit tensorflow\tensorflow\contrib\cmake\tools\create_def_file.py line 128, changing def_fp.write("\t ??1OpDef@tensorflow@@UEAA@XZ\n") to def_fp.write("\t ??1OpDef@tensorflow@@UAE@XZ\n") (the x86 decoration of the same destructor symbol)


Edit tensorflow\tensorflow\core\common_runtime\bfc_allocator.h line 381, changing

  inline int Log2FloorNonZero(uint64 n) {
#if defined(__GNUC__)
    return 63 ^ __builtin_clzll(n);
#elif defined(PLATFORM_WINDOWS)
    unsigned long index;
    _BitScanReverse64(&index, n);
    return index;
#else
    return Log2FloorNonZeroSlow(n);
#endif
  }

to

  inline int Log2FloorNonZero(uint64 n) {
#if defined(__GNUC__)
    return 63 ^ __builtin_clzll(n);
#elif defined(PLATFORM_WINDOWS) && defined(_WIN64)
    unsigned long index;
    _BitScanReverse64(&index, n);
    return index;
#else
    return Log2FloorNonZeroSlow(n);
#endif
  }
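
With the patch above, a 32-bit Windows build falls through to Log2FloorNonZeroSlow, because _BitScanReverse64 only exists for 64-bit targets. As a rough sketch of an alternative (not from this thread and not tested against TensorFlow), the 64-bit value can be split into two halves and scanned with the 32-bit _BitScanReverse intrinsic, which is available on x86 as well. The names Log2FloorNonZero32 and Log2FloorNonZero64 are hypothetical helpers made up for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#if defined(_MSC_VER)
#include <intrin.h>
#endif

// Hypothetical helper (not part of TensorFlow): index of the highest set bit
// of a non-zero 32-bit value.
inline int Log2FloorNonZero32(std::uint32_t n) {
#if defined(__GNUC__)
  return 31 ^ __builtin_clz(n);
#elif defined(_MSC_VER)
  unsigned long index;
  _BitScanReverse(&index, n);  // 32-bit scan, usable on both x86 and x64
  return static_cast<int>(index);
#else
  // Portable fallback: shift right until the value is exhausted.
  int result = 0;
  while (n >>= 1) ++result;
  return result;
#endif
}

// Hypothetical helper: 64-bit floor(log2(n)) built from two 32-bit scans.
inline int Log2FloorNonZero64(std::uint64_t n) {
  assert(n != 0);  // the result is undefined for zero, as in the original
  const std::uint32_t high = static_cast<std::uint32_t>(n >> 32);
  if (high != 0) {
    return 32 + Log2FloorNonZero32(high);
  }
  return Log2FloorNonZero32(static_cast<std::uint32_t>(n));
}
```

Whether this is measurably faster than the slow fallback inside the allocator is an open question; it only shows that the missing 64-bit intrinsic does not force a bit-by-bit loop.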

build

open VS2017 x64_x86 Cross Tools Command Prompt

cd %HOMEPATH%\tensorflow\tensorflow\contrib\cmake\build

cmake .. -G "Visual Studio 15 2017" -T host=x64 ^
-DCMAKE_BUILD_TYPE=Release ^
-Dtensorflow_BUILD_PYTHON_BINDINGS=OFF ^
-Dtensorflow_BUILD_SHARED_LIB=ON ^
-Dtensorflow_WIN_CPU_SIMD_OPTIONS="/arch:IA32"

cmake --build . --target tensorflow --config Release -- /fileLogger

You will get tensorflow libs in tensorflow\tensorflow\contrib\cmake\build\Release.

BraynStorm commented 6 years ago

Thanks for uploading. Although, for these to be useful, you need to include the whole content of the "CMAKE_INSTALL_PREFIX" directory (usually C:\Program Files\ or something) as it contains the generated and properly placed header files.

fo40225 commented 6 years ago

Updated. Including the properly placed headers makes it look much better now.

BraynStorm commented 6 years ago

Perfect! Thank you very much for the quick response and going through the trouble of building it.

miek0tube commented 6 years ago

Applied all the fixes but still get the following error:

cmake .. -G "Visual Studio 15 2017" -T host=x64 -DCMAKE_BUILD_TYPE=Release -Dtensorflow_BUILD_PYTHON_BINDINGS=OFF -Dtensorflow_BUILD_SHARED_LIB=ON -Dtensorflow_WIN_CPU_SIMD_OPTIONS="/arch:IA32"
.............
CMake Error at tf_tools.cmake:52 (list):
  list sub-command REMOVE_ITEM requires two or more arguments.
Call Stack (most recent call first):
  CMakeLists.txt:463 (include)

Interestingly, running the build on TF 1.6.0.rc0 does not generate the error (but compilation fails anyway). Could you please take a look at this?

fo40225 commented 6 years ago

@miek0tube Try not turning tensorflow_BUILD_PYTHON_BINDINGS OFF; that setting has some reported issues.

miek0tube commented 6 years ago

Yeah, thanks! That worked!

miek0tube commented 6 years ago

To my disappointment, running in 32-bit mode is 6-8 times slower! Did you experience the same performance hit?

sleeplessai commented 6 years ago

@miek0tube Off topic, but I am curious about the scenario your TF build will be used in. Could you share more?

fo40225 commented 6 years ago

I don't know whether this change affects performance much. https://github.com/tensorflow/tensorflow/blob/v1.9.0/tensorflow/core/common_runtime/bfc_allocator.h#L382

You can help to figure out the root cause. https://github.com/fo40225/tensorflow-windows-wheel/issues/20

miek0tube commented 6 years ago

I'm porting image recognition software from Python to C++. It runs as a plugin to another program, which is a 32-bit C# application. As for the bfc allocator patch, I don't think it could cause such a dramatic slowdown. That piece of code is fast enough with or without the patch.

miek0tube commented 6 years ago

Looks like I was partially blind that day. The option -Dtensorflow_WIN_CPU_SIMD_OPTIONS="/arch:IA32" disables the enhanced CPU instruction support. I changed it to "/arch:SSE2" and everything is fine now.
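
For anyone who wants to verify which instruction set a 32-bit build was actually compiled for, here is a small sketch that is not part of TensorFlow. It relies on the MSVC predefined macro _M_IX86_FP (defined only for x86 targets: 0 for /arch:IA32, 1 for /arch:SSE, 2 for /arch:SSE2 and above) and on the __SSE__ / __SSE2__ macros that GCC and Clang define. CompiledSimdLevel is a hypothetical name chosen for this example, and it reports only what the translation unit containing it was compiled with.

```cpp
#include <string>

// Hypothetical helper (not part of TensorFlow): describes the SIMD level
// selected at compile time for this translation unit.
inline std::string CompiledSimdLevel() {
#if defined(_M_IX86_FP)
  // MSVC, x86 targets only.
#if _M_IX86_FP == 0
  return "IA32 (no SSE)";
#elif _M_IX86_FP == 1
  return "SSE";
#else
  return "SSE2 or higher";
#endif
#elif defined(_M_X64) || defined(__SSE2__)
  // x64 always has SSE2; GCC/Clang define __SSE2__ when it is enabled.
  return "SSE2 or higher";
#elif defined(__SSE__)
  return "SSE";
#else
  return "no SSE detected at compile time";
#endif
}
```

Compiling this in a project that uses the same flags as the TensorFlow build and printing the result should show "IA32 (no SSE)" for the slow configuration and "SSE2 or higher" after switching to "/arch:SSE2".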