Context
When building OpenVINO Tokenizers with
-DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_CXX_FLAGS="-stdlib=libc++" -DCMAKE_EXE_LINKER_FLAGS="-stdlib=libc++ -lc++abi" -DCMAKE_SHARED_LINKER_FLAGS="-stdlib=libc++ -lc++abi"
and
-DBUILD_FAST_TOKENIZERS=ON
the build fails.
The underlying reason is that fast_tokenizer, a dependency of OpenVINO Tokenizers, cannot be built with Clang. An issue (https://github.com/PaddlePaddle/PaddleNLP/issues/8565) has been filed in PaddleNLP.
As a workaround, I changed
src/icu4c.patch
to remove the hardcoded GCC compiler. The patch works well in fast_tokenizer, but not in OpenVINO Tokenizers: the build process looks the same with and without the patch, and the copy of the dependency that CMake downloads is unchanged after patching. This can be checked by running
cat fast_tokenizer/cmake/external/icu.cmake | grep shared
in the CMake build directory.
What needs to be done?
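To confirm whether a candidate fix actually reaches the downloaded dependency, the manual check from the Context section could be scripted. The following is only a minimal sketch: the function name check_icu_cmake is hypothetical and not part of the repository, and the path is the one quoted in the Context section, taken relative to the CMake build directory passed as the first argument.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenVINO Tokenizers): print the lines
# containing "shared" from the icu.cmake that CMake downloaded, so the
# output can be compared before and after changing src/icu4c.patch.
# Usage: check_icu_cmake <cmake-build-dir>
check_icu_cmake() {
    build_dir="${1:-.}"
    icu_cmake="$build_dir/fast_tokenizer/cmake/external/icu.cmake"

    # Report a missing file separately from "no matching lines".
    if [ ! -f "$icu_cmake" ]; then
        echo "not found: $icu_cmake" >&2
        return 2
    fi

    # Same filter as the manual check; -n adds line numbers for diffing.
    grep -n shared "$icu_cmake"
}
```

Running check_icu_cmake on the build directory once without the patch and once with it, and comparing the two outputs, would show whether the patch is applied. Identical output matches the behavior described in the Context section.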
Example Pull Requests
No response
Resources
Contact points
@ilya-lavrenov
Ticket
No response