Closed Fabi1080 closed 7 years ago
I have the same problem: ./configure fails with the above exception. Building anyway succeeds, but I have to give the build more memory:
bazel build -c opt tensorflow_serving/... --local_resources 2048,1.0,1.0
Then, however, when I try to run I get the following error:
#> bazel-bin/tensorflow_serving/example/inception_export
WARNING:tensorflow:tf.op_scope(values, name, default_name) is deprecated, use tf.name_scope(name, default_name, values)
WARNING:tensorflow:tf.variable_op_scope(values, name, default_name) is deprecated, use tf.variable_scope(name, default_name, values)
WARNING:tensorflow:VARIABLES collection name is deprecated, please use GLOBAL_VARIABLES instead; VARIABLES will be removed after 2017-03-02.
WARNING:tensorflow:tf.op_scope(values, name, default_name) is deprecated, use tf.name_scope(name, default_name, values)
Traceback (most recent call last):
File "/serving/bazel-bin/tensorflow_serving/example/inception_export.runfiles/tf_serving/tensorflow_serving/example/inception_export.py", line 169, in <module>
tf.app.run()
File "/serving/bazel-bin/tensorflow_serving/example/inception_export.runfiles/org_tensorflow/tensorflow/python/platform/app.py", line 44, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "/serving/bazel-bin/tensorflow_serving/example/inception_export.runfiles/tf_serving/tensorflow_serving/example/inception_export.py", line 165, in main
export()
File "/serving/bazel-bin/tensorflow_serving/example/inception_export.runfiles/tf_serving/tensorflow_serving/example/inception_export.py", line 79, in export
logits, _ = inception_model.inference(images, NUM_CLASSES + 1)
File "/serving/bazel-bin/tensorflow_serving/example/inception_export.runfiles/inception_model/inception/inception_model.py", line 87, in inference
scope=scope)
File "/serving/bazel-bin/tensorflow_serving/example/inception_export.runfiles/inception_model/inception/slim/inception_model.py", line 87, in inception_v3
scope='conv0')
File "/serving/bazel-bin/tensorflow_serving/example/inception_export.runfiles/inception_model/inception/slim/scopes.py", line 155, in func_with_args
return func(*args, **current_args)
File "/serving/bazel-bin/tensorflow_serving/example/inception_export.runfiles/inception_model/inception/slim/ops.py", line 228, in conv2d
restore=restore)
File "/serving/bazel-bin/tensorflow_serving/example/inception_export.runfiles/inception_model/inception/slim/scopes.py", line 155, in func_with_args
return func(*args, **current_args)
File "/serving/bazel-bin/tensorflow_serving/example/inception_export.runfiles/inception_model/inception/slim/variables.py", line 289, in variable
trainable=trainable, collections=collections)
File "/serving/bazel-bin/tensorflow_serving/example/inception_export.runfiles/org_tensorflow/tensorflow/python/ops/variable_scope.py", line 988, in get_variable
custom_getter=custom_getter)
File "/serving/bazel-bin/tensorflow_serving/example/inception_export.runfiles/org_tensorflow/tensorflow/python/ops/variable_scope.py", line 890, in get_variable
custom_getter=custom_getter)
File "/serving/bazel-bin/tensorflow_serving/example/inception_export.runfiles/org_tensorflow/tensorflow/python/ops/variable_scope.py", line 348, in get_variable
validate_shape=validate_shape)
File "/serving/bazel-bin/tensorflow_serving/example/inception_export.runfiles/org_tensorflow/tensorflow/python/ops/variable_scope.py", line 333, in _true_getter
caching_device=caching_device, validate_shape=validate_shape)
File "/serving/bazel-bin/tensorflow_serving/example/inception_export.runfiles/org_tensorflow/tensorflow/python/ops/variable_scope.py", line 693, in _get_single_variable
loss = regularizer(v)
File "/serving/bazel-bin/tensorflow_serving/example/inception_export.runfiles/inception_model/inception/slim/losses.py", line 71, in regularizer
return tf.mul(l2_weight, tf.nn.l2_loss(tensor), name='value')
AttributeError: 'module' object has no attribute 'mul'
I followed the same Docker instruction link, and got a very similar problem here too.
root@c81e0f0c8a2e:/serving/tensorflow# ./configure
Please specify the location of python. [Default is /usr/bin/python]:
Please specify optimization flags to use during compilation [Default is -march=native]:
Do you wish to use jemalloc as the malloc implementation? [Y/n] y
jemalloc enabled
Do you wish to build TensorFlow with Google Cloud Platform support? [y/N] n
No Google Cloud Platform support will be enabled for TensorFlow
Do you wish to build TensorFlow with Hadoop File System support? [y/N] n
No Hadoop File System support will be enabled for TensorFlow
Do you wish to build TensorFlow with the XLA just-in-time compiler (experimental)? [y/N] y
XLA JIT support will be enabled for TensorFlow
Found possible Python library paths:
/usr/local/lib/python2.7/dist-packages
/usr/lib/python2.7/dist-packages
Please input the desired Python library path to use. Default is [/usr/local/lib/python2.7/dist-packages]
Using python library path: /usr/local/lib/python2.7/dist-packages
Do you wish to build TensorFlow with OpenCL support? [y/N] n
No OpenCL support will be enabled for TensorFlow
Do you wish to build TensorFlow with CUDA support? [y/N] n
No CUDA support will be enabled for TensorFlow
Configuration finished
INFO: Reading 'startup' options from /root/.bazelrc: --batch
INFO: Starting clean (this may take a while). Consider using --expunge_async if the clean takes more than several minutes.
INFO: Reading 'startup' options from /root/.bazelrc: --batch
java.lang.RuntimeException: Unrecoverable error while evaluating node 'REPOSITORY_DIRECTORY:@iron_form_element_behavior' (requested by nodes 'REPOSITORY:@iron_form_element_behavior')
at com.google.devtools.build.skyframe.ParallelEvaluator$Evaluate.run(ParallelEvaluator.java:429)
at com.google.devtools.build.lib.concurrent.AbstractQueueVisitor$WrappedRunnable.run(AbstractQueueVisitor.java:501)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
at com.google.devtools.build.lib.bazel.repository.downloader.HttpConnector.connect(HttpConnector.java:124)
at com.google.devtools.build.lib.bazel.repository.downloader.HttpConnectorMultiplexer.establishConnection(HttpConnectorMultiplexer.java:296)
at com.google.devtools.build.lib.bazel.repository.downloader.HttpConnectorMultiplexer.connect(HttpConnectorMultiplexer.java:121)
at com.google.devtools.build.lib.bazel.repository.downloader.HttpDownloader.download(HttpDownloader.java:197)
at com.google.devtools.build.lib.bazel.repository.downloader.HttpDownloader.download(HttpDownloader.java:120)
at com.google.devtools.build.lib.bazel.repository.NewHttpArchiveFunction.fetch(NewHttpArchiveFunction.java:63)
at com.google.devtools.build.lib.rules.repository.RepositoryDelegatorFunction.compute(RepositoryDelegatorFunction.java:155)
at com.google.devtools.build.skyframe.ParallelEvaluator$Evaluate.run(ParallelEvaluator.java:370)
... 4 more
The same story. Running Docker on macOS Sierra via Docker for Mac. Things I tried with no success:
It seems the bug came up recently, because last week I was still able to run ./configure
Ok, good point that it was running last week.
I went back in the git history using
git clone --recurse-submodules https://github.com/tensorflow/serving
cd serving/tensorflow
git checkout [an older commit from the configure file history](https://github.com/tensorflow/tensorflow/commits/b00fc538638f87ac45be9105057b9865f0f9418b/configure)
./configure
I found out that this is the commit that first introduced the exception.
@martinwicke do you have any idea what may cause this problem?
Going back further in the history returns a lot of errors that dependencies are no longer available.
I've got a similar error on Ubuntu 16.04.1 LTS: everything builds just fine until ./configure (all settings left at their defaults). Pastebin with a log. Note that the first attempt ended with java.lang.RuntimeException: Unrecoverable error while evaluating node 'REPOSITORY_DIRECTORY:@jemalloc' (requested by nodes 'REPOSITORY:@jemalloc'), and the second attempt of ./configure ended with java.lang.RuntimeException: Unrecoverable error while evaluating node 'REPOSITORY_DIRECTORY:@jpeg' (requested by nodes 'REPOSITORY:@jpeg') (jemalloc vs. jpeg).
When I ran it several more times, I got similar-looking errors for nodes 'REPOSITORY_DIRECTORY:@llvm', and then for 'REPOSITORY_DIRECTORY:@jemalloc' again. It looks pretty arbitrary.
I've also seen a bunch of lines like this somewhere in Bazel's status output during ./configure:
INFO: Failed to connect to https://github.com/components/es6-promise/archive/v2.1.0.tar.gz trying again in 3,200ms
Maybe it's related.
I'm getting the same error on macOS Sierra. I've tried both Docker for Mac and Docker Toolbox. Log in pastebin. I'm also getting the 'Failed to connect to' errors encountered above.
This seems to be an issue loading https resources. Dependencies load fine via http but fail over https.
INFO: Failed to connect to https://github.com/junit-team/junit4/releases/download/r4.12/junit-4.12.jar trying again in 1,600ms
INFO: Loading package: @swig//
INFO: Downloading http://bazel-mirror.storage.googleapis.com/github.com/llvm-mirror/llvm/archive/4e9e4f277ad254e02a0cff33c61cd827e600da62.tar.gz: 11,538,625 bytes
INFO: Failed to connect to https://github.com/polymerelements/iron-form-element-behavior/archive/v1.0.6.tar.gz trying again in 400ms
INFO: Downloading http://bazel-mirror.storage.googleapis.com/github.com/llvm-mirror/llvm/archive/4e9e4f277ad254e02a0cff33c61cd827e600da62.tar.gz: 13,051,989 bytes
I was able to get ./configure to work by running the following two commands (in the running Docker container):
apt-get install ca-certificates-java
update-ca-certificates -f
The issue I had was none of the github dependencies would load, as they used https transport.
I'm building from the Docker file at tensorflow_serving/tools/docker/Dockerfile.devel. This image is built using Ubuntu 14.04.5 LTS (per /etc/issue) and openjdk version "1.8.0_111" (in spite of the Bazel install docs saying OpenJDK is not available for this version of Ubuntu). Host is Docker for Mac. I was also successful in getting ./configure to work by installing the Oracle JDK and reinstalling Bazel.
Thanks. Had a feeling it was a certificates issue.
apt-get install ca-certificates-java
update-ca-certificates -f
solved the problem for me.
Thanks,
apt-get install ca-certificates-java && update-ca-certificates -f
solved the Exception in ./configure for me.
I'm still stuck with the following when running inception_export:
...
File "/serving/bazel-bin/tensorflow_serving/example/inception_export.runfiles/inception_model/inception/slim/losses.py", line 71, in regularizer
return tf.mul(l2_weight, tf.nn.l2_loss(tensor), name='value')
AttributeError: 'module' object has no attribute 'mul'
But I guess that is another problem.
Yes, use multiply instead.
Can I configure Inception to use multiply instead of mul? The call to mul happens in the Inception code, not my own code. Is this an incompatibility between the TensorFlow and Inception versions?
Currently, yes. Current inception may not be compatible with TF at head.
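For anyone who wants to patch the vendored code locally before the submodule sync lands, the fix is a mechanical rename: TF 1.0 removed tf.mul in favor of tf.multiply (same arguments). A small sketch demonstrating the rewrite on a copy of the offending line quoted in the traceback above:

```python
# TF 1.0 removed tf.mul in favor of tf.multiply; the call signature
# is unchanged. Demonstrate the mechanical rename on a copy of the
# offending line from inception/slim/losses.py:
line = "return tf.mul(l2_weight, tf.nn.l2_loss(tensor), name='value')"
fixed = line.replace("tf.mul(", "tf.multiply(")
print(fixed)
# prints: return tf.multiply(l2_weight, tf.nn.l2_loss(tensor), name='value')
```

Applying the same substitution to the file itself (e.g. with sed) is only a stopgap; the proper fix is the upstream commit referenced later in this thread.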
Alright, getting there. Can you advise on any compatible versions of TF and Inception?
FYI, we have a fix for the specific file here: https://github.com/tensorflow/models/commit/e5079c839058ff40dcbd15515a9cfb462fabbc2a#diff-1f23067444bd897aed5e55f64b63fe76, but the submodule tf_model hasn't been synced yet. When this pull request ( https://github.com/tensorflow/serving/pull/304) is merged, the file should be updated.
Changes merged and the submodule is now updated. Please give it a try.
Closing this issue since the original problem was resolved. If there are still any problems with the models submodule, please reopen in a new bug.
The problem seems to still exist. Ubuntu 16.04 LTS, Raspberry Pi. requested by nodes 'REPOSITORY:@jpeg/jemalloc
I can confirm this issue seems resolved: ./configure and the build run smoothly, and starting the server also works. ✅
Calling the server gives me another error, FAILED_PRECONDITION, details="Default serving signature key not found." If anybody here can point me in the right direction, I would appreciate it:
bazel-bin/tensorflow_serving/example/inception_client --server localhost:9000 --image Picture.jpg
D0207 11:28:41.984146287 6541 ev_posix.c:101] Using polling engine: poll
Traceback (most recent call last):
File "/serving/bazel-bin/tensorflow_serving/example/inception_client.runfiles/tf_serving/tensorflow_serving/example/inception_client.py", line 56, in <module>
tf.app.run()
File "/serving/bazel-bin/tensorflow_serving/example/inception_client.runfiles/org_tensorflow/tensorflow/python/platform/app.py", line 44, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "/serving/bazel-bin/tensorflow_serving/example/inception_client.runfiles/tf_serving/tensorflow_serving/example/inception_client.py", line 51, in main
result = stub.Predict(request, 10.0) # 10 secs timeout
File "/usr/local/lib/python2.7/dist-packages/grpc/beta/_client_adaptations.py", line 300, in __call__
self._request_serializer, self._response_deserializer)
File "/usr/local/lib/python2.7/dist-packages/grpc/beta/_client_adaptations.py", line 198, in _blocking_unary_unary
raise _abortion_error(rpc_error_call)
grpc.framework.interfaces.face.face.AbortionError: AbortionError(code=StatusCode.FAILED_PRECONDITION, details="Default serving signature key not found.")
E0207 11:28:46.020743693 6541 chttp2_transport.c:1810] close_transport: {"created":"@1486466926.020710033","description":"FD shutdown","file":"src/core/lib/iomgr/ev_poll_posix.c","file_line":427}
How did you generate the export? We have changed inception_client.py to work with model export in SavedModel format (instead of the deprecated SessionBundle format). Can you remove the old exports first and then follow the instructions again?
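As a side note, the two export formats are easy to tell apart on disk: a SavedModel export directory contains a saved_model.pb (or saved_model.pbtxt) file, while the deprecated SessionBundle export wrote an export.meta graph file. A small sketch (export_format is a hypothetical helper, not part of TF Serving):

```python
import os

def export_format(export_dir):
    """Guess whether a model export directory is in SavedModel or the
    deprecated SessionBundle format (hypothetical helper)."""
    if not os.path.isdir(export_dir):
        return "unknown"
    entries = os.listdir(export_dir)
    # SavedModel exports contain a saved_model.pb / saved_model.pbtxt file.
    if "saved_model.pb" in entries or "saved_model.pbtxt" in entries:
        return "SavedModel"
    # SessionBundle exports wrote an export.meta graph definition.
    if any(e.startswith("export.meta") for e in entries):
        return "SessionBundle (deprecated)"
    return "unknown"
```

If this reports SessionBundle for an old export, deleting it and re-exporting with the current code should produce the default serving signature the server is looking for.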
Have you synced your repository with submodules? Can you give more details on the error message?
@lilao
Hello,
I can no longer reproduce the previous error.
Previously, I just followed samjabrahams' step-by-step instructions:
https://github.com/samjabrahams/tensorflow-on-raspberry-pi/blob/master/GUIDE.md#4-build-bazel
During compilation, the error was exactly the same as the one in Fabi1080's reply.
However, another error appeared when I configured the build: after ./configure, the system returned an "error loading package" error.
yufeiizhang@RaspberryPI-YFZhang:~/tf/tensorflow$ ./configure
Please specify the location of python. [Default is /usr/bin/python]: /usr/bin/python3
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]:
Do you wish to use jemalloc as the malloc implementation? [Y/n] n
jemalloc disabled
Do you wish to build TensorFlow with Google Cloud Platform support? [y/N] n
No Google Cloud Platform support will be enabled for TensorFlow
Do you wish to build TensorFlow with Hadoop File System support? [y/N] n
No Hadoop File System support will be enabled for TensorFlow
Do you wish to build TensorFlow with the XLA just-in-time compiler (experimental)? [y/N] n
No XLA JIT support will be enabled for TensorFlow
Found possible Python library paths:
/usr/local/lib/python3.5/dist-packages
/usr/lib/python3/dist-packages
Please input the desired Python library path to use. Default is [/usr/local/lib/python3.5/dist-packages]
Using python library path: /usr/local/lib/python3.5/dist-packages
Do you wish to build TensorFlow with OpenCL support? [y/N] n
No OpenCL support will be enabled for TensorFlow
Do you wish to build TensorFlow with CUDA support? [y/N] n
No CUDA support will be enabled for TensorFlow
Configuration finished
INFO: Starting clean (this may take a while). Consider using --expunge_async if the clean takes more than several minutes.
ERROR: package contains errors: tensorflow/compiler/tf2xla/kernels.
ERROR: error loading package 'tensorflow/compiler/tf2xla/kernels': Encountered error while reading extension file 'protobuf.bzl': no such package '@protobuf//': Error downloading [http://bazel-mirror.storage.googleapis.com/github.com/google/protobuf/archive/9d3288e651700f3d52e6b4ead2a9f9ab02da53f4.tar.gz, https://github.com/google/protobuf/archive/9d3288e651700f3d52e6b4ead2a9f9ab02da53f4.tar.gz] to /home/yufeiizhang/.cache/bazel/_bazel_yufeiizhang/4e73af2737d6f680f5f982db4f6d678d/external/protobuf/9d3288e651700f3d52e6b4ead2a9f9ab02da53f4.tar.gz: All mirrors are down: [GET returned 404 Not Found, java.lang.IllegalStateException].
Updating certificates didn't solve my problem, but upgrading Bazel to 0.4.4 did. Thank you guys.
@k-schreiber, did you resolve your above issue of:
FAILED_PRECONDITION, details="Default serving signature key not found."
I am running into the same issue and wondering if you have resolved it. Thanks.
@brookwc are you running at head? I would recommend you sync TensorFlow Serving to head (along with its submodules) and follow the instructions in https://tensorflow.github.io/serving/serving_inception to export the model. Note that we recently switched the export to SavedModel from the old SessionBundle format, so you'll have to delete the old model if you exported before that. Using the latest code, the right signatures should be defined.
@kirilg you're right, syncing up with the latest head solved the issue. Thanks!
Hi, I'm on macOS trying to get the Serving tutorials running in a Docker container. I am following these steps.
When running ./configure (using all default options), it stops with a RuntimeException (see below).
This line seems a bit strange to me, but I have no idea what to do:
Any help is appreciated!