tensorflow / recommenders-addons

Additional utils and helpers to extend TensorFlow when building recommendation systems, contributed and maintained by SIG Recommenders.
Apache License 2.0

Bazel version conflict when building tensorflow/serving with recommenders-addons #452

Closed uzhy1987 closed 1 month ago

uzhy1987 commented 1 month ago

The latest version of recommenders-addons (v0.7.2) uses Bazel 5.1.1, while tensorflow/serving (v2.15.1) uses Bazel 6.4.0. How can I build tensorflow/serving with the recommenders-addons ops?
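A quick way to confirm the mismatch is to compare the Bazel versions the two checkouts pin. The sketch below assumes each repository records its pin in a `.bazelversion` file at the repository root; the function names and the directory layout in the usage comment are hypothetical.

```shell
#!/usr/bin/env bash
# Print the Bazel version pinned by a checkout, or "unknown" when the
# checkout has no .bazelversion file (assumption: the pin lives there).
pinned_bazel_version() {
  local repo_dir="$1"
  if [ -f "${repo_dir}/.bazelversion" ]; then
    # Take the first line and drop whitespace and carriage returns.
    head -n 1 "${repo_dir}/.bazelversion" | tr -d '[:space:]'
  else
    printf 'unknown'
  fi
}

# Return 0 when both checkouts pin the same known Bazel version, 1 otherwise.
same_bazel_pin() {
  local a b
  a="$(pinned_bazel_version "$1")"
  b="$(pinned_bazel_version "$2")"
  echo "$1 pins Bazel ${a}; $2 pins Bazel ${b}"
  [ "${a}" != "unknown" ] && [ "${a}" = "${b}" ]
}

# Hypothetical usage, with both repositories cloned side by side:
#   same_bazel_pin ./recommenders-addons ./serving || echo "Bazel versions differ"
```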

uzhy1987 commented 1 month ago

I tried the following approach, but it did not work.

I built recommenders-addons (v0.7.2), copied _cuckoo_hashtable_ops.so into the tensorflow/serving:2.15.1 Docker image, and preloaded the shared libraries as shown below, but it fails to execute:

LD_PRELOAD="/usr/local/lib/libtensorflow_framework.so.2 /usr/local/lib/_cuckoo_hashtable_ops.so /usr/local/lib/_math_ops.so" tensorflow_model_server --port=8500 --rest_api_port=8501 --model_name=${MODEL_NAME} --model_base_path=${MODEL_BASE_PATH}/${MODEL_NAME} "$@"
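One thing worth ruling out with a preload like this is a path that does not exist inside the image. The sketch below is a hypothetical helper (not part of TFRA or TF Serving) that checks each library before joining the paths with colons, which glibc accepts as an `LD_PRELOAD` separator alongside spaces. It only covers that one failure mode and says nothing about symbol or ABI mismatches.

```shell
#!/usr/bin/env bash
# Join the given shared-library paths into an LD_PRELOAD value.
# Returns 1 and names the first path that is not a readable regular file.
build_ld_preload() {
  local joined="" lib
  for lib in "$@"; do
    if [ ! -f "${lib}" ] || [ ! -r "${lib}" ]; then
      echo "missing or unreadable library: ${lib}" >&2
      return 1
    fi
    # Append with a colon separator, except before the first entry.
    joined="${joined:+${joined}:}${lib}"
  done
  printf '%s' "${joined}"
}

# Hypothetical usage inside the serving container:
#   preload="$(build_ld_preload /usr/local/lib/libtensorflow_framework.so.2 \
#       /usr/local/lib/_cuckoo_hashtable_ops.so /usr/local/lib/_math_ops.so)" || exit 1
#   LD_PRELOAD="${preload}" tensorflow_model_server "$@"
```

If every file is present and the server still fails, running `ldd -r` on the op library may show which symbols it cannot resolve.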

alykhantejani commented 1 month ago

Hmm, I'm also trying to follow the README for building TF Serving with TFRA ops, but the last push of tfra/serving is 2.8.3, which is probably too old since it uses Bazel 3.7.2.

alykhantejani commented 1 month ago

@uzhy1987 please post an update if you find anything, as I am also stuck here.

uzhy1987 commented 1 month ago

I tried building tensorflow/serving 2.11.1 with TFRA 0.7.2, which both use Bazel 5.1.1, and it succeeded.

cd recommenders-addons
python configure.py
cp .bazelrc tools/serving_padding/.bazelrc_padding

alykhantejani commented 1 month ago

OK, so this is what I'm doing:

docker run -it --network=host --entrypoint /bin/bash tensorflow/serving:2.11.1-devel

cd /home
git clone https://github.com/tensorflow/recommenders-addons.git
cd recommenders-addons
python configure.py
cp .bazelrc tools/serving_padding/.bazelrc_padding

export SERVING_WITH_GPU=0 
export TFRA_BRANCH="master"
export TF_SERVING_BRANCH="r2.11.1"

export TFRA_SERVING_WORKSPACE=/home/
mkdir -p $TFRA_SERVING_WORKSPACE && cd $TFRA_SERVING_WORKSPACE
git clone -b $TF_SERVING_BRANCH https://github.com/tensorflow/serving.git

cd $TFRA_SERVING_WORKSPACE/recommenders-addons/tools
bash config_tfserving.sh $TFRA_BRANCH $TFRA_SERVING_WORKSPACE/serving $SERVING_WITH_GPU
cd $TFRA_SERVING_WORKSPACE/serving

bazel build tensorflow_recommenders_addons/dynamic_embedding/core:_cuckoo_hashtable_ops.so

But this gives me an error:

ERROR: Traceback (most recent call last):
    File "/home/serving/tensorflow_recommenders_addons/tensorflow_recommenders_addons.bzl", line 10, column 6, in <toplevel>
        "cuda_is_configured",

However, if I run this build command from ../recommenders-addons, it builds, which leads me to think it's a workspace issue with TF Serving.

Can you share your WORKSPACE file?

Also, maybe @rhdong might be able to assist?

alykhantejani commented 1 month ago

OK, if I remove cuda_is_configured it seems to work. This feels hacky, though I am building without GPU support.

alykhantejani commented 1 month ago

@MoFHeka perhaps you could help?

MoFHeka commented 1 month ago

Try deleting all Windows-related code in the Bazel file, such as tensorflow_recommenders_addons/tensorflow_recommenders_addons.bzl:44
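To find candidate lines before editing by hand, a plain text search is usually enough. The sketch below is a hypothetical helper that lists every case-insensitive mention of "windows" with its file name and line number. It assumes the Windows-specific code actually contains that word, which the thread does not confirm.

```shell
#!/usr/bin/env bash
# Print "file:line:text" for every case-insensitive mention of "windows"
# in the given files; returns 1 when nothing matches.
list_windows_refs() {
  grep -n -i -H 'windows' "$@"
}

# Hypothetical usage from the serving workspace:
#   list_windows_refs tensorflow_recommenders_addons/tensorflow_recommenders_addons.bzl
```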

uzhy1987 commented 1 month ago

I used the Docker image tensorflow/serving:2.11.1-devel, with tensorflow/serving 2.11.1 and TFRA 0.7.2. I changed .bazelrc_padding and replaced cuda_is_configured() with if_cuda_is_configured(cuda_deps).

docker run -it --network=host -v /data/tfra:/data/tfra -v /tmp:/tmp tensorflow/serving:2.11.1-devel /bin/bash
cd /data/tfra/serving; TEST_TMPDIR=.cache bazel build tensorflow_serving/model_servers:tensorflow_model_server

The .bazelrc_padding changes are as follows:

build --action_env TF_HEADER_DIR="/usr/local/lib/python3.9/dist-packages/tensorflow/include"
build --action_env TF_SHARED_LIBRARY_DIR="/usr/local/lib/python3.9/dist-packages/tensorflow"
build --action_env TF_SHARED_LIBRARY_NAME="libtensorflow_framework.so.2"
build --action_env TF_CXX11_ABI_FLAG="0"
build --action_env TF_CXX_STANDARD="c++17"
build --action_env TF_VERSION_INTEGER="2111"
build --action_env FOR_TF_SERVING="1"
build --spawn_strategy=standalone
build --strategy=Genrule=standalone
build -c opt
build --copt=-mavx
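The TensorFlow-specific values in this file depend on the Python and TensorFlow versions in the image, so they may be easier to generate than to hard-code. The sketch below is a hypothetical helper that prints four of the lines from values passed in as arguments. It assumes `TF_VERSION_INTEGER` is simply the version with the dots removed, which matches the 2.11.1 and 2111 pairing here but is not confirmed by the thread.

```shell
#!/usr/bin/env bash
# Emit the TensorFlow-specific .bazelrc_padding lines for a given install.
# Arguments: <tf_version> <tf_package_dir> <cxx11_abi_flag>
emit_tf_action_env() {
  local tf_version="$1" tf_dir="$2" abi_flag="$3"
  # Assumption: TF_VERSION_INTEGER is the version string without dots
  # (2.11.1 -> 2111), inferred from the values posted in this thread.
  local version_integer
  version_integer="$(printf '%s' "${tf_version}" | tr -d '.')"
  printf 'build --action_env TF_HEADER_DIR="%s/include"\n' "${tf_dir}"
  printf 'build --action_env TF_SHARED_LIBRARY_DIR="%s"\n' "${tf_dir}"
  printf 'build --action_env TF_CXX11_ABI_FLAG="%s"\n' "${abi_flag}"
  printf 'build --action_env TF_VERSION_INTEGER="%s"\n' "${version_integer}"
}

# Hypothetical usage, querying the TensorFlow installed in the container:
#   tf_version="$(python -c 'import tensorflow as tf; print(tf.__version__)')"
#   tf_dir="$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_lib())')"
#   abi="$(python -c 'import tensorflow as tf; print(tf.sysconfig.CXX11_ABI_FLAG)')"
#   emit_tf_action_env "${tf_version}" "${tf_dir}" "${abi}"
```

The remaining lines (`FOR_TF_SERVING`, the spawn strategy, and the compiler options) do not depend on the installed TensorFlow and can stay as written.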

MoFHeka commented 1 month ago

@uzhy1987 @alykhantejani TFRA also supports compiling with Bazel 6.4.0 if you remove the Windows-related code, so there is no need to downgrade the TF version. Just update the Bazel-version-related code and it should be OK.

alykhantejani commented 1 month ago

OK, so I'm doing it with 2.15.1 and Bazel 6.4.0. I've got it almost working, but linking against TSL fails:

```
error: undefined reference to 'tsl::TfCheckOpHelperOutOfLine
```

etc.

Any ideas? Did you also face this? I have set `TF_CXX11_ABI_FLAG=0`.

alykhantejani commented 1 month ago

OK, it turns out my installed TF version wasn't correct. I can send a PR with the Bazel fixes to upgrade to 6.4.0.
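A mismatch like this can be caught before the link step by comparing the installed TensorFlow version with the TF Serving version being built. The sketch below is a hypothetical check that compares only the major.minor part and ignores a leading `r` or `v` from branch or tag names. Treating major.minor agreement as sufficient is an assumption, not something the thread states.

```shell
#!/usr/bin/env bash
# Return 0 when two version strings share the same major.minor prefix.
# A leading "r" or "v" (as in branch or tag names) is ignored.
same_major_minor() {
  local a="${1#[rv]}" b="${2#[rv]}"
  [ "$(printf '%s' "${a}" | cut -d. -f1,2)" = "$(printf '%s' "${b}" | cut -d. -f1,2)" ]
}

# Hypothetical usage before starting the build:
#   installed="$(python -c 'import tensorflow as tf; print(tf.__version__)')"
#   same_major_minor "${installed}" "2.15.1" \
#     || echo "installed TensorFlow ${installed} does not match TF Serving 2.15.1"
```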