Open BraynStorm opened 6 years ago
I have not tried building for x86 yet; I can give it a try.
The build fails; it seems to require some code fixes.
"C:\Users\User\Source\Repos\tensorflow\tensorflow\contrib\cmake\build\tf_python_build_pip_package.vcxproj" (default target) (1) ->
"C:\Users\User\Source\Repos\tensorflow\tensorflow\contrib\cmake\build\pywrap_tensorflow_internal.vcxproj" (default target) (2) ->
"C:\Users\User\Source\Repos\tensorflow\tensorflow\contrib\cmake\build\pywrap_tensorflow_internal_static.vcxproj" (default target) (3) ->
"C:\Users\User\Source\Repos\tensorflow\tensorflow\contrib\cmake\build\tf_core_cpu.vcxproj" (default target) (126) ->
(ClCompile target) ->
C:\Users\User\Source\Repos\tensorflow\tensorflow/core/common_runtime/bfc_allocator.h(383): error C3861: '_BitScanReverse64': identifier not found (compiling source file C:\Users\User\Source\Repos\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc) [C:\Users\User\Source\Repos\tensorflow\tensorflow\contrib\cmake\build\tf_core_cpu.vcxproj]
Because cc_test is broken, I could not run a full test.
I haven't used TensorFlow's C++ API before; this build worked on the sample with a 32-bit exe target.
Before using it, you should test it thoroughly yourself.
https://github.com/fo40225/tensorflow-windows-wheel/tree/master/1.7.0/cpp
Here is how to build the 32-bit C++ library.
Environment: Windows 10 x64, git 2.14.1 (in PATH), cmake 3.9.6 (in PATH), python (in PATH), Visual Studio 2017
cd %HOMEPATH%
git clone https://github.com/tensorflow/tensorflow.git -b v1.7.0
cd tensorflow/tensorflow/contrib/cmake
mkdir build
cd build
add add_definitions(-DEIGEN_DEFAULT_DENSE_INDEX_TYPE=std::int64_t)
into tensorflow\tensorflow\contrib\cmake\CMakeLists.txt
edit tensorflow\tensorflow\contrib\cmake\tools\create_def_file.py
line 128
from
def_fp.write("\t ??1OpDef@tensorflow@@UEAA@XZ\n")
to
def_fp.write("\t ??1OpDef@tensorflow@@UAE@XZ\n")
edit tensorflow\tensorflow\core\common_runtime\bfc_allocator.h
line 381
from
inline int Log2FloorNonZero(uint64 n) {
#if defined(__GNUC__)
  return 63 ^ __builtin_clzll(n);
#elif defined(PLATFORM_WINDOWS)
  unsigned long index;
  _BitScanReverse64(&index, n);
  return index;
#else
  return Log2FloorNonZeroSlow(n);
#endif
}
to
inline int Log2FloorNonZero(uint64 n) {
#if defined(__GNUC__)
  return 63 ^ __builtin_clzll(n);
#elif defined(PLATFORM_WINDOWS) && defined(_WIN64)
  unsigned long index;
  _BitScanReverse64(&index, n);
  return index;
#else
  return Log2FloorNonZeroSlow(n);
#endif
}
open VS2017 x64_x86 Cross Tools Command Prompt
cd %HOMEPATH%\tensorflow\tensorflow\contrib\cmake\build
cmake .. -G "Visual Studio 15 2017" -T host=x64 ^
-DCMAKE_BUILD_TYPE=Release ^
-Dtensorflow_BUILD_PYTHON_BINDINGS=OFF ^
-Dtensorflow_BUILD_SHARED_LIB=ON ^
-Dtensorflow_WIN_CPU_SIMD_OPTIONS="/arch:IA32"
cmake --build . --target tensorflow --config Release -- /fileLogger
You will get the tensorflow libs in tensorflow\tensorflow\contrib\cmake\build\Release.
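With the bfc_allocator.h patch above, a 32-bit MSVC build no longer calls _BitScanReverse64 and falls through to Log2FloorNonZeroSlow. As a rough sketch of an alternative (not part of the patch in this thread), a 32-bit build could scan the two 32-bit halves with _BitScanReverse instead. The sketch below is standalone and makes some assumptions: it uses std::uint64_t in place of TensorFlow's uint64, tests _MSC_VER instead of PLATFORM_WINDOWS, and its Log2FloorNonZeroSlow is a simple stand-in rather than TensorFlow's own.

```cpp
#include <cassert>
#include <cstdint>
#if defined(_MSC_VER)
#include <intrin.h>
#endif

// Stand-in portable fallback (assumption: not TensorFlow's exact code).
// Shifts right until only the top set bit remains.
inline int Log2FloorNonZeroSlow(std::uint64_t n) {
  int r = 0;
  while (n > 1) {
    n >>= 1;
    ++r;
  }
  return r;
}

// Returns floor(log2(n)); n must be non-zero.
inline int Log2FloorNonZero(std::uint64_t n) {
  assert(n != 0);
#if defined(__GNUC__)
  return 63 ^ __builtin_clzll(n);
#elif defined(_MSC_VER) && defined(_WIN64)
  unsigned long index;
  _BitScanReverse64(&index, n);
  return static_cast<int>(index);
#elif defined(_MSC_VER)
  // 32-bit MSVC: _BitScanReverse64 is unavailable, so scan the high
  // 32 bits first and fall back to the low 32 bits if they are zero.
  unsigned long index;
  const unsigned long hi = static_cast<unsigned long>(n >> 32);
  if (_BitScanReverse(&index, hi)) {
    return static_cast<int>(index) + 32;
  }
  _BitScanReverse(&index, static_cast<unsigned long>(n & 0xFFFFFFFFu));
  return static_cast<int>(index);
#else
  return Log2FloorNonZeroSlow(n);
#endif
}
```

Whether this matters for performance is a separate question; it only shows that the 32-bit case does not have to use the loop-based fallback.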
Thanks for uploading. For these to be useful, though, you need to include the whole contents of the "CMAKE_INSTALL_PREFIX" directory (usually C:\Program Files\ or something similar), as it contains the generated and properly placed header files.
Updated. Including the properly placed headers makes it look much better now.
Perfect! Thank you very much for the quick response and going through the trouble of building it.
I applied all the fixes but still get the following error:
cmake .. -G "Visual Studio 15 2017" -T host=x64 -DCMAKE_BUILD_TYPE=Release -Dtensorflow_BUILD_PYTHON_BINDINGS=OFF -Dtensorflow_BUILD_SHARED_LIB=ON -Dtensorflow_WIN_CPU_SIMD_OPTIONS="/arch:IA32"
.............
CMake Error at tf_tools.cmake:52 (list):
list sub-command REMOVE_ITEM requires two or more arguments.
Call Stack (most recent call first):
CMakeLists.txt:463 (include)
Interestingly, running the build on TF 1.6.0.rc0 does not generate the error (but compilation fails anyway). Could you please take a look at this?
@miek0tube Try not setting tensorflow_BUILD_PYTHON_BINDINGS to OFF; turning it off has some reported issues.
Yeah, thanks! That worked!
To my disappointment, running in 32-bit mode is 6-8 times slower! Did you experience the same performance hit?
@miek0tube Off topic: I am curious about the scenario you will use TF in. Could you share more?
I don't know whether this change affects performance much. https://github.com/tensorflow/tensorflow/blob/v1.9.0/tensorflow/core/common_runtime/bfc_allocator.h#L382
You can help to figure out the root cause. https://github.com/fo40225/tensorflow-windows-wheel/issues/20
I'm porting image recognition software from Python to C++. It runs as a plugin to another program, which is a 32-bit C# application. About the bfc allocator patch: I don't think this could cause such a dramatic slowdown. That piece of code is fast enough with or without the patch.
Looks like I was partially blind that day. This option, -Dtensorflow_WIN_CPU_SIMD_OPTIONS="/arch:IA32", disables support for the enhanced CPU instruction sets. I changed it to "/arch:SSE2" and everything is fine now.
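To double-check which SIMD baseline a 32-bit build was actually compiled for, one option is to inspect the compiler's predefined macros. The sketch below is a minimal, standalone example and not part of TensorFlow; the function name CompiledSimdBaseline is hypothetical. It relies on MSVC's _M_IX86_FP macro (defined for 32-bit x86 targets: 0 for /arch:IA32, 1 for /arch:SSE, 2 for /arch:SSE2 and above) and, as an assumption for other compilers, on __SSE2__.

```cpp
// Reports the x86 SIMD baseline this translation unit was compiled for.
// Hypothetical helper for illustration only.
inline const char* CompiledSimdBaseline() {
#if defined(_M_IX86_FP)
  // MSVC, 32-bit x86 targets: value reflects the /arch option in effect.
#if _M_IX86_FP == 0
  return "IA32 (no SSE)";
#elif _M_IX86_FP == 1
  return "SSE";
#else
  return "SSE2 or higher";
#endif
#elif defined(_M_X64) || defined(__SSE2__)
  // x64 always has SSE2; GCC/Clang define __SSE2__ when it is enabled.
  return "SSE2 or higher";
#else
  return "unknown or no SSE2";
#endif
}
```

Compiling this with the same /arch flag as the TensorFlow build and printing the result would show "IA32 (no SSE)" for the slow configuration described above and "SSE2 or higher" after switching to /arch:SSE2.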
Title. Also, I see that you build with SSE too; does this mean you can build TF as a 32-bit binary?