tensorflow / serving

A flexible, high-performance serving system for machine learning models
https://www.tensorflow.org/serving
Apache License 2.0

`No module named 'tensorflow.compat'` when trying to run the experimental remote_predict op. #2091

Open rrkarim opened 1 year ago

rrkarim commented 1 year ago

Bug Report

System information

Describe the problem


I am following the README for the remote predict op. The git commit and sha256 are the same as in master. I run:

tools/run_in_docker.sh bazel run tensorflow_serving/experimental/example:half_plus_two_with_rpop -- --target_address=localhost:8500

I get ModuleNotFoundError: No module named 'tensorflow.compat' from the remote_predict_ops.py file. I then tried removing compat and rewriting everything against the v2 API, which leads to ABI issues:

ImportError: /home/rasulkarimov/serving/.cache/_bazel_root/0bdcf8b08a5256a78e00fcc7b2c20a7c/execroot/tf_serving/bazel-out/k8-opt/bin/tensorflow_serving/experimental/example/half_plus_two_with_rpop.runfiles/org_tensorflow/tensorflow/python/../libtensorflow_framework.so.2: undefined symbol: _ZTIN10tensorflow7SessionE
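For reference, the mangled name in that error can be decoded by hand. Below is a minimal sketch (the helper demangle_typeinfo is hypothetical and only handles the simple `_ZTIN...E` pattern of the Itanium C++ ABI, not general mangled names). It shows that the missing symbol is the typeinfo for tensorflow::Session, which may suggest that the libtensorflow_framework.so.2 being loaded is not the one the op library was built against, though that is an assumption rather than a confirmed diagnosis.

```python
def demangle_typeinfo(symbol):
    """Decode a typeinfo symbol of the form _ZTI N <len><name>... E.

    Only plain nested names are supported (no templates or substitutions);
    this is an illustrative helper, not a full demangler.
    """
    prefix = "_ZTIN"  # _ZTI = "typeinfo for", N...E = nested name
    if not (symbol.startswith(prefix) and symbol.endswith("E")):
        raise ValueError("unsupported symbol: " + symbol)

    body = symbol[len(prefix):-1]
    parts = []
    i = 0
    while i < len(body):
        # Each component is a decimal length followed by that many characters.
        j = i
        while j < len(body) and body[j].isdigit():
            j += 1
        if j == i:
            raise ValueError("expected a length prefix in: " + symbol)
        length = int(body[i:j])
        if j + length > len(body):
            raise ValueError("truncated component in: " + symbol)
        parts.append(body[j:j + length])
        i = j + length

    return "typeinfo for " + "::".join(parts)


if __name__ == "__main__":
    print(demangle_typeinfo("_ZTIN10tensorflow7SessionE"))
```

For anything beyond this simple pattern, a real demangler such as c++filt is the better tool.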

Is there a commit that I can check out to reproduce the example? I would appreciate it.

Additionally, I have to manually symlink libtensorflow_framework.2.11.0 to libtensorflow_framework.2.

Source code / logs


tensorflow/ops/remote_predict/python/ops/remote_predict_ops.py", line 23, in <module>
    import tensorflow.compat.v1 as tf
ModuleNotFoundError: No module named 'tensorflow.compat'

singhniraj08 commented 1 year ago

@rrkarim,

I have found a similar issue, where commenting out

#from tensorflow.contrib.image.python.ops.single_image_random_dot_stereograms import single_image_random_dot_stereograms

in bazel-bin/tensorflow_serving/example/mnist_saved_model.runfiles/org_tensorflow/tensorflow/contrib/image/__init__.py resolves the issue. Please try this and let us know if it works. Thank you!

Also, "experimental" indicates that the class or method in question is in early development, incomplete, or, less commonly, not up to standards. It is a collection of user contributions that have not yet been integrated with main TensorFlow but are still available as open source for users to test.

rrkarim commented 1 year ago

Hey @singhniraj08, I understand that it is an experimental feature. That is why I'm asking for the last commit where the feature was tested successfully. Also, I'm not sure how the issue you linked is related to this one.

singhniraj08 commented 1 year ago

@rrkarim,

I think the commits below are the ones you are looking for. Thanks!

Add an example which call the Remote Predict Op directly. (commit: d5b980f487996aa1f890a559eae968735dfebf5d)

Added abstract layer for remote predict op over different RPC protocols with template. (commit: c54ca7ec95928b6eec39f350140835ebbe3caeb0)

rrkarim commented 1 year ago

@singhniraj08 thanks. I will check them and update here then.

rrkarim commented 1 year ago

@singhniraj08 no success with those commits either. Bazel refuses to build on some of them. I also tried more recent commits that changed the remote ops directly, and I'm still getting: ModuleNotFoundError: No module named 'tensorflow.compat'. Why would I get this error? Maybe some Bazel targets are missing TF dependencies?
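
One way to narrow this down is to check which tensorflow package Python actually resolves inside the Bazel runfiles tree. The snippet below is a minimal diagnostic sketch using only the standard library; the helper names describe_module and has_submodule are hypothetical. The assumption behind it (not confirmed in this thread) is that a bare or namespace-style tensorflow directory in the runfiles could be picked up ahead of the full package, in which case tensorflow.compat would not be found.

```python
import importlib.util


def describe_module(name):
    """Return (origin, search_locations) for a top-level module, or None if it cannot be found.

    An origin of None together with non-empty search locations usually means
    the name resolved to a namespace package (a directory without __init__.py).
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin, list(spec.submodule_search_locations or [])


def has_submodule(dotted_name):
    """Report whether a dotted module name can be located.

    find_spec imports the parent package first, so a missing or broken parent
    is treated as "not found" here.
    """
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except ImportError:
        return False


if __name__ == "__main__":
    # Run this with the same interpreter and sys.path as the failing target.
    print("tensorflow resolves to:", describe_module("tensorflow"))
    print("tensorflow.compat found:", has_submodule("tensorflow.compat"))
```

If the printed locations point somewhere other than the expected org_tensorflow runfiles directory, or the origin is None, that would be consistent with a missing dependency on the Bazel target, but it does not prove it.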