tensorflow / serving

A flexible, high-performance serving system for machine learning models
https://www.tensorflow.org/serving
Apache License 2.0

configure in Docker container ends with java.lang.RuntimeException: Unrecoverable error while evaluating node #301

Closed Fabi1080 closed 7 years ago

Fabi1080 commented 7 years ago

Hi, I'm on macOS and trying to get started with the Serving tutorials by running a Docker container. I am following these steps.

When running ./configure (accepting all the default options), it stops with a RuntimeException (see below).

This line seems a bit strange to me, but I have no idea what to do about it:

'REPOSITORY_DIRECTORY:@jpeg' (requested by nodes 'REPOSITORY:@jpeg')

Any help is appreciated!

root@a9c8b25976dd:/serving/tensorflow# ./configure 
Please specify the location of python. [Default is /usr/bin/python]: 
Please specify optimization flags to use during compilation [Default is -march=native]: 
Do you wish to use jemalloc as the malloc implementation? [Y/n] 
jemalloc enabled
Do you wish to build TensorFlow with Google Cloud Platform support? [y/N] 
No Google Cloud Platform support will be enabled for TensorFlow
Do you wish to build TensorFlow with Hadoop File System support? [y/N] 
No Hadoop File System support will be enabled for TensorFlow
Do you wish to build TensorFlow with the XLA just-in-time compiler (experimental)? [y/N] 
No XLA support will be enabled for TensorFlow
Found possible Python library paths:
  /usr/local/lib/python2.7/dist-packages
  /usr/lib/python2.7/dist-packages
Please input the desired Python library path to use.  Default is [/usr/local/lib/python2.7/dist-packages]

Using python library path: /usr/local/lib/python2.7/dist-packages
Do you wish to build TensorFlow with OpenCL support? [y/N] 
No OpenCL support will be enabled for TensorFlow
Do you wish to build TensorFlow with CUDA support? [y/N] 
No CUDA support will be enabled for TensorFlow
Configuration finished
INFO: Reading 'startup' options from /root/.bazelrc: --batch
INFO: Starting clean (this may take a while). Consider using --expunge_async if the clean takes more than several minutes.

INFO: Reading 'startup' options from /root/.bazelrc: --batch
java.lang.RuntimeException: Unrecoverable error while evaluating node 'REPOSITORY_DIRECTORY:@jpeg' (requested by nodes 'REPOSITORY:@jpeg')
    at com.google.devtools.build.skyframe.ParallelEvaluator$Evaluate.run(ParallelEvaluator.java:429)
    at com.google.devtools.build.lib.concurrent.AbstractQueueVisitor$WrappedRunnable.run(AbstractQueueVisitor.java:501)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalArgumentException: Invalid EvalException:
java.lang.InterruptedException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:998)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
    at java.util.concurrent.Semaphore.acquire(Semaphore.java:312)
    at com.google.devtools.build.lib.bazel.repository.downloader.HttpDownloader.download(HttpDownloader.java:196)
    at com.google.devtools.build.lib.bazel.repository.skylark.SkylarkRepositoryContext.downloadAndExtract(SkylarkRepositoryContext.java:594)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.google.devtools.build.lib.syntax.FuncallExpression.callMethod(FuncallExpression.java:316)
    at com.google.devtools.build.lib.syntax.FuncallExpression.invokeObjectMethod(FuncallExpression.java:732)
    at com.google.devtools.build.lib.syntax.FuncallExpression.invokeObjectMethod(FuncallExpression.java:784)
    at com.google.devtools.build.lib.syntax.FuncallExpression.doEval(FuncallExpression.java:770)
    at com.google.devtools.build.lib.syntax.Expression.eval(Expression.java:48)
    at com.google.devtools.build.lib.syntax.ExpressionStatement.doExec(ExpressionStatement.java:46)
    at com.google.devtools.build.lib.syntax.Statement.exec(Statement.java:37)
    at com.google.devtools.build.lib.syntax.UserDefinedFunction.call(UserDefinedFunction.java:136)
    at com.google.devtools.build.lib.syntax.BaseFunction.call(BaseFunction.java:439)
    at com.google.devtools.build.lib.bazel.repository.skylark.SkylarkRepositoryFunction.fetch(SkylarkRepositoryFunction.java:106)
    at com.google.devtools.build.lib.rules.repository.RepositoryDelegatorFunction.compute(RepositoryDelegatorFunction.java:155)
    at com.google.devtools.build.skyframe.ParallelEvaluator$Evaluate.run(ParallelEvaluator.java:370)
    at com.google.devtools.build.lib.concurrent.AbstractQueueVisitor$WrappedRunnable.run(AbstractQueueVisitor.java:501)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

    at com.google.devtools.build.lib.syntax.EvalException.<init>(EvalException.java:112)
    at com.google.devtools.build.lib.syntax.EvalException$EvalExceptionWithJavaCause.<init>(EvalException.java:209)
    at com.google.devtools.build.lib.syntax.EvalException$EvalExceptionWithJavaCause.<init>(EvalException.java:217)
    at com.google.devtools.build.lib.syntax.FuncallExpression.callMethod(FuncallExpression.java:344)
    at com.google.devtools.build.lib.syntax.FuncallExpression.invokeObjectMethod(FuncallExpression.java:732)
    at com.google.devtools.build.lib.syntax.FuncallExpression.invokeObjectMethod(FuncallExpression.java:784)
    at com.google.devtools.build.lib.syntax.FuncallExpression.doEval(FuncallExpression.java:770)
    at com.google.devtools.build.lib.syntax.Expression.eval(Expression.java:48)
    at com.google.devtools.build.lib.syntax.ExpressionStatement.doExec(ExpressionStatement.java:46)
    at com.google.devtools.build.lib.syntax.Statement.exec(Statement.java:37)
    at com.google.devtools.build.lib.syntax.UserDefinedFunction.call(UserDefinedFunction.java:136)
    at com.google.devtools.build.lib.syntax.BaseFunction.call(BaseFunction.java:439)
    at com.google.devtools.build.lib.bazel.repository.skylark.SkylarkRepositoryFunction.fetch(SkylarkRepositoryFunction.java:106)
    at com.google.devtools.build.lib.rules.repository.RepositoryDelegatorFunction.compute(RepositoryDelegatorFunction.java:155)
    at com.google.devtools.build.skyframe.ParallelEvaluator$Evaluate.run(ParallelEvaluator.java:370)
    ... 4 more
k-schreiber commented 7 years ago

I have the same problem: ./configure fails with the above exception. Building anyway succeeds, but I have to assign more memory for the build: bazel build -c opt tensorflow_serving/... --local_resources 2048,1.0,1.0

Then, however, when I try to run it, I get the following error:

#> bazel-bin/tensorflow_serving/example/inception_export

WARNING:tensorflow:tf.op_scope(values, name, default_name) is deprecated, use tf.name_scope(name, default_name, values)
WARNING:tensorflow:tf.variable_op_scope(values, name, default_name) is deprecated, use tf.variable_scope(name, default_name, values)
WARNING:tensorflow:VARIABLES collection name is deprecated, please use GLOBAL_VARIABLES instead; VARIABLES will be removed after 2017-03-02.
WARNING:tensorflow:tf.op_scope(values, name, default_name) is deprecated, use tf.name_scope(name, default_name, values)
Traceback (most recent call last):
File "/serving/bazel-bin/tensorflow_serving/example/inception_export.runfiles/tf_serving/tensorflow_serving/example/inception_export.py", line 169, in <module>
    tf.app.run()
File "/serving/bazel-bin/tensorflow_serving/example/inception_export.runfiles/org_tensorflow/tensorflow/python/platform/app.py", line 44, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "/serving/bazel-bin/tensorflow_serving/example/inception_export.runfiles/tf_serving/tensorflow_serving/example/inception_export.py", line 165, in main
    export()
  File "/serving/bazel-bin/tensorflow_serving/example/inception_export.runfiles/tf_serving/tensorflow_serving/example/inception_export.py", line 79, in export
    logits, _ = inception_model.inference(images, NUM_CLASSES + 1)
  File "/serving/bazel-bin/tensorflow_serving/example/inception_export.runfiles/inception_model/inception/inception_model.py", line 87, in inference
    scope=scope)
  File "/serving/bazel-bin/tensorflow_serving/example/inception_export.runfiles/inception_model/inception/slim/inception_model.py", line 87, in inception_v3
    scope='conv0')
  File "/serving/bazel-bin/tensorflow_serving/example/inception_export.runfiles/inception_model/inception/slim/scopes.py", line 155, in func_with_args
    return func(*args, **current_args)
  File "/serving/bazel-bin/tensorflow_serving/example/inception_export.runfiles/inception_model/inception/slim/ops.py", line 228, in conv2d
    restore=restore)
  File "/serving/bazel-bin/tensorflow_serving/example/inception_export.runfiles/inception_model/inception/slim/scopes.py", line 155, in func_with_args
    return func(*args, **current_args)
  File "/serving/bazel-bin/tensorflow_serving/example/inception_export.runfiles/inception_model/inception/slim/variables.py", line 289, in variable
    trainable=trainable, collections=collections)
  File "/serving/bazel-bin/tensorflow_serving/example/inception_export.runfiles/org_tensorflow/tensorflow/python/ops/variable_scope.py", line 988, in get_variable
    custom_getter=custom_getter)
  File "/serving/bazel-bin/tensorflow_serving/example/inception_export.runfiles/org_tensorflow/tensorflow/python/ops/variable_scope.py", line 890, in get_variable
    custom_getter=custom_getter)
  File "/serving/bazel-bin/tensorflow_serving/example/inception_export.runfiles/org_tensorflow/tensorflow/python/ops/variable_scope.py", line 348, in get_variable
    validate_shape=validate_shape)
  File "/serving/bazel-bin/tensorflow_serving/example/inception_export.runfiles/org_tensorflow/tensorflow/python/ops/variable_scope.py", line 333, in _true_getter
    caching_device=caching_device, validate_shape=validate_shape)
  File "/serving/bazel-bin/tensorflow_serving/example/inception_export.runfiles/org_tensorflow/tensorflow/python/ops/variable_scope.py", line 693, in _get_single_variable
    loss = regularizer(v)
  File "/serving/bazel-bin/tensorflow_serving/example/inception_export.runfiles/inception_model/inception/slim/losses.py", line 71, in regularizer
    return tf.mul(l2_weight, tf.nn.l2_loss(tensor), name='value')
AttributeError: 'module' object has no attribute 'mul'
crystalclear506 commented 7 years ago

I followed the same Docker instructions linked above and ran into a very similar problem.

root@c81e0f0c8a2e:/serving/tensorflow# ./configure
Please specify the location of python. [Default is /usr/bin/python]:
Please specify optimization flags to use during compilation [Default is -march=native]:
Do you wish to use jemalloc as the malloc implementation? [Y/n] y
jemalloc enabled
Do you wish to build TensorFlow with Google Cloud Platform support? [y/N] n
No Google Cloud Platform support will be enabled for TensorFlow
Do you wish to build TensorFlow with Hadoop File System support? [y/N] n
No Hadoop File System support will be enabled for TensorFlow
Do you wish to build TensorFlow with the XLA just-in-time compiler (experimental)? [y/N] y
XLA JIT support will be enabled for TensorFlow
Found possible Python library paths:
  /usr/local/lib/python2.7/dist-packages
  /usr/lib/python2.7/dist-packages
Please input the desired Python library path to use.  Default is [/usr/local/lib/python2.7/dist-packages]

Using python library path: /usr/local/lib/python2.7/dist-packages
Do you wish to build TensorFlow with OpenCL support? [y/N] n
No OpenCL support will be enabled for TensorFlow
Do you wish to build TensorFlow with CUDA support? [y/N] n
No CUDA support will be enabled for TensorFlow
Configuration finished
INFO: Reading 'startup' options from /root/.bazelrc: --batch
INFO: Starting clean (this may take a while). Consider using --expunge_async if the clean takes more than several minutes.
INFO: Reading 'startup' options from /root/.bazelrc: --batch
java.lang.RuntimeException: Unrecoverable error while evaluating node 'REPOSITORY_DIRECTORY:@iron_form_element_behavior' (requested by nodes 'REPOSITORY:@iron_form_element_behavior')
    at com.google.devtools.build.skyframe.ParallelEvaluator$Evaluate.run(ParallelEvaluator.java:429)
    at com.google.devtools.build.lib.concurrent.AbstractQueueVisitor$WrappedRunnable.run(AbstractQueueVisitor.java:501)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
    at com.google.devtools.build.lib.bazel.repository.downloader.HttpConnector.connect(HttpConnector.java:124)
    at com.google.devtools.build.lib.bazel.repository.downloader.HttpConnectorMultiplexer.establishConnection(HttpConnectorMultiplexer.java:296)
    at com.google.devtools.build.lib.bazel.repository.downloader.HttpConnectorMultiplexer.connect(HttpConnectorMultiplexer.java:121)
    at com.google.devtools.build.lib.bazel.repository.downloader.HttpDownloader.download(HttpDownloader.java:197)
    at com.google.devtools.build.lib.bazel.repository.downloader.HttpDownloader.download(HttpDownloader.java:120)
    at com.google.devtools.build.lib.bazel.repository.NewHttpArchiveFunction.fetch(NewHttpArchiveFunction.java:63)
    at com.google.devtools.build.lib.rules.repository.RepositoryDelegatorFunction.compute(RepositoryDelegatorFunction.java:155)
    at com.google.devtools.build.skyframe.ParallelEvaluator$Evaluate.run(ParallelEvaluator.java:370)
    ... 4 more
vaskinyy commented 7 years ago

Same story here: Docker on macOS Sierra via Docker for Mac. Things that I tried with no success:

It seems the bug came up recently, because last week I was able to run ./configure.

Fabi1080 commented 7 years ago

OK, good point that it was working last week.

I went back in the git history using

git clone --recurse-submodules https://github.com/tensorflow/serving
cd serving/tensorflow
git checkout [an older commit from the configure file history](https://github.com/tensorflow/tensorflow/commits/b00fc538638f87ac45be9105057b9865f0f9418b/configure)
./configure

I found out that this is the commit that first introduced the exception.

@martinwicke do you have any idea what may cause this problem?

Going back further in the history produces a lot of errors about dependencies that are no longer available.

yeputons commented 7 years ago

I've got a similar error on Ubuntu 16.04.1 LTS: everything goes fine until ./configure (all settings left at their defaults). Pastebin with a log. Note that the first attempt ended with java.lang.RuntimeException: Unrecoverable error while evaluating node 'REPOSITORY_DIRECTORY:@jemalloc' (requested by nodes 'REPOSITORY:@jemalloc'), and the second attempt at ./configure ended with java.lang.RuntimeException: Unrecoverable error while evaluating node 'REPOSITORY_DIRECTORY:@jpeg' (requested by nodes 'REPOSITORY:@jpeg') (jemalloc vs. jpeg).

When I ran it several more times, I got similar-looking errors for the node 'REPOSITORY_DIRECTORY:@llvm', and then for 'REPOSITORY_DIRECTORY:@jemalloc' again. Looks pretty arbitrary.

I've also seen a bunch of lines like this in Bazel's status output during ./configure: INFO: Failed to connect to https://github.com/components/es6-promise/archive/v2.1.0.tar.gz trying again in 3,200ms. Maybe it's related.

aerrity commented 7 years ago

I'm getting the same error on macOS Sierra. I've tried both Docker for Mac and Docker Toolbox. Log in pastebin. I'm also getting the 'Failed to connect to' messages encountered above.

aerrity commented 7 years ago

This seems to be an issue loading HTTPS resources. Dependencies load fine over HTTP but fail over HTTPS.

INFO: Failed to connect to https://github.com/junit-team/junit4/releases/download/r4.12/junit-4.12.jar trying again in 1,600ms
INFO: Loading package: @swig//
INFO: Downloading http://bazel-mirror.storage.googleapis.com/github.com/llvm-mirror/llvm/archive/4e9e4f277ad254e02a0cff33c61cd827e600da62.tar.gz: 11,538,625 bytes
INFO: Failed to connect to https://github.com/polymerelements/iron-form-element-behavior/archive/v1.0.6.tar.gz trying again in 400ms
INFO: Downloading http://bazel-mirror.storage.googleapis.com/github.com/llvm-mirror/llvm/archive/4e9e4f277ad254e02a0cff33c61cd827e600da62.tar.gz: 13,051,989 bytes
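
Not part of the original report, but a quick way to narrow this down: a small Python 2.7 sketch (the devel image ships Python 2.7) that fetches one of the failing HTTPS URLs directly. The URL is just one taken from the log above. If this succeeds inside the container while Bazel keeps retrying, plain network access is fine and the JVM trust store that Bazel uses is the more likely culprit.

# Hypothetical connectivity check, not from the thread: fetch a URL Bazel failed on.
# Success here while Bazel still fails points away from networking and toward the
# Java certificate store (addressed below with ca-certificates-java).
import urllib2  # Python 2.7 standard library

url = 'https://github.com/polymerelements/iron-form-element-behavior/archive/v1.0.6.tar.gz'
response = urllib2.urlopen(url, timeout=30)
print('HTTP %d, %d bytes' % (response.getcode(), len(response.read())))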

mountaintom commented 7 years ago

I was able to get ./configure to work by running the following two commands (in the running Docker container):

apt-get install ca-certificates-java
update-ca-certificates -f

The issue I had was that none of the GitHub dependencies would load, as they use HTTPS transport.

I'm building from the Docker file at tensorflow_serving/tools/docker/Dockerfile.devel. This image is built on Ubuntu 14.04.5 LTS (per /etc/issue) with openjdk version "1.8.0_111" (in spite of the Bazel install docs saying OpenJDK is not available for this version of Ubuntu). The host is Docker for Mac. I was also able to get ./configure to work by installing the Oracle JDK and reinstalling Bazel.

aerrity commented 7 years ago

Thanks. Had a feeling it was a certificates issue.

apt-get install ca-certificates-java
update-ca-certificates -f

solved the problem for me.

k-schreiber commented 7 years ago

Thanks,

apt-get install ca-certificates-java && update-ca-certificates -f solved the Exception in ./configure for me.

I'm still stuck with the following error when running inception_export:

...
File "/serving/bazel-bin/tensorflow_serving/example/inception_export.runfiles/inception_model/inception/slim/losses.py", line 71, in regularizer
    return tf.mul(l2_weight, tf.nn.l2_loss(tensor), name='value')
AttributeError: 'module' object has no attribute 'mul'

But I guess, that is another problem.

martinwicke commented 7 years ago

Yes, use multiply instead.
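
For anyone hitting the same AttributeError: tf.mul was removed in TensorFlow 1.0 and tf.multiply is its replacement. A minimal sketch of the one-line change in the slim losses.py pattern from the traceback above (the l2_weight default and the test tensor are just illustrative):

import tensorflow as tf

# Before (TF < 1.0): return tf.mul(l2_weight, tf.nn.l2_loss(tensor), name='value')
# After (TF >= 1.0): tf.multiply is the renamed op
def regularizer(tensor, l2_weight=0.0005):
    return tf.multiply(l2_weight, tf.nn.l2_loss(tensor), name='value')

loss = regularizer(tf.ones([3, 3]))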

k-schreiber commented 7 years ago

Can I configure Inception to use multiply instead of mul? The call to mul happens in the Inception code, not my own code. Is this an incompatibility between TensorFlow and Inception versions?

martinwicke commented 7 years ago

Currently, yes. Current Inception may not be compatible with TF at head.

k-schreiber commented 7 years ago

Alright, getting there. Can you advise on any compatible versions of TF and Inception?

lilao commented 7 years ago

FYI, we have a fix for the specific file here: https://github.com/tensorflow/models/commit/e5079c839058ff40dcbd15515a9cfb462fabbc2a#diff-1f23067444bd897aed5e55f64b63fe76, but the submodule tf_model hasn't been synced yet. When this pull request (https://github.com/tensorflow/serving/pull/304) is merged, the file should be updated.

kirilg commented 7 years ago

Changes merged and the submodule is now updated. Please give it a try.

Closing this issue since the original problem was resolved. If there are still any problems with the models submodule, please file a new bug.

yufeiizhang commented 7 years ago

The problem seems to still exist. Ubuntu 16.04 LTS, Raspberry Pi: requested by nodes 'REPOSITORY:@jpeg/jemalloc

k-schreiber commented 7 years ago

I can confirm this issue seems resolved: ./configure and the build run smoothly, and starting the server also works. ✅

Calling the server gives me another error, FAILED_PRECONDITION, details="Default serving signature key not found." If anybody here can give a hint in the right direction, I would appreciate it:

bazel-bin/tensorflow_serving/example/inception_client --server localhost:9000 --image Picture.jpg 
D0207 11:28:41.984146287    6541 ev_posix.c:101]             Using polling engine: poll
Traceback (most recent call last):
  File "/serving/bazel-bin/tensorflow_serving/example/inception_client.runfiles/tf_serving/tensorflow_serving/example/inception_client.py", line 56, in <module>
    tf.app.run()
  File "/serving/bazel-bin/tensorflow_serving/example/inception_client.runfiles/org_tensorflow/tensorflow/python/platform/app.py", line 44, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "/serving/bazel-bin/tensorflow_serving/example/inception_client.runfiles/tf_serving/tensorflow_serving/example/inception_client.py", line 51, in main
    result = stub.Predict(request, 10.0)  # 10 secs timeout
  File "/usr/local/lib/python2.7/dist-packages/grpc/beta/_client_adaptations.py", line 300, in __call__
    self._request_serializer, self._response_deserializer)
  File "/usr/local/lib/python2.7/dist-packages/grpc/beta/_client_adaptations.py", line 198, in _blocking_unary_unary
    raise _abortion_error(rpc_error_call)
grpc.framework.interfaces.face.face.AbortionError: AbortionError(code=StatusCode.FAILED_PRECONDITION, details="Default serving signature key not found.")
E0207 11:28:46.020743693    6541 chttp2_transport.c:1810]    close_transport: {"created":"@1486466926.020710033","description":"FD shutdown","file":"src/core/lib/iomgr/ev_poll_posix.c","file_line":427}
lilao commented 7 years ago

How did you generate the export? We have changed inception_client.py to work with model export in SavedModel format (instead of the deprecated SessionBundle format). Can you remove the old exports first and then follow the instructions again?
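
For reference, a minimal sketch of what a SavedModel export with a default serving signature looks like in the TF 1.x API. This is not the actual inception_export.py code; the export path, tensor names, and the tiny stand-in graph are placeholders.

import tensorflow as tf

export_dir = '/tmp/inception_export/1'  # placeholder; Serving expects a numeric version subdirectory

with tf.Graph().as_default(), tf.Session() as sess:
    # Stand-in graph instead of the real Inception model.
    images = tf.placeholder(tf.float32, shape=[None, 299, 299, 3], name='images')
    weights = tf.Variable(tf.zeros([3, 1001]), name='weights')
    logits = tf.matmul(tf.reduce_mean(images, axis=[1, 2]), weights, name='logits')
    sess.run(tf.global_variables_initializer())

    builder = tf.saved_model.builder.SavedModelBuilder(export_dir)
    signature = tf.saved_model.signature_def_utils.predict_signature_def(
        inputs={'images': images}, outputs={'scores': logits})
    builder.add_meta_graph_and_variables(
        sess,
        [tf.saved_model.tag_constants.SERVING],
        signature_def_map={
            # The key the client's FAILED_PRECONDITION error says is missing.
            tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: signature,
        })
    builder.save()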

lilao commented 7 years ago

Have you synced your repository along with its submodules? Can you give us more details on the error message?

yufeiizhang commented 7 years ago

@lilao Hello, I can no longer reproduce the previous error.
Previously, I just followed samjabrahams' step-by-step guide (https://github.com/samjabrahams/tensorflow-on-raspberry-pi/blob/master/GUIDE.md#4-build-bazel), and during compilation the error was exactly the same as the one in Fabi1080's reply. However, another error appeared when I configured the build: after ./configure, the system returned an "error loading package" error.

yufeiizhang@RaspberryPI-YFZhang:~/tf/tensorflow$ ./configure
Please specify the location of python. [Default is /usr/bin/python]: /usr/bin/python3
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]:
Do you wish to use jemalloc as the malloc implementation? [Y/n] n
jemalloc disabled
Do you wish to build TensorFlow with Google Cloud Platform support? [y/N] n
No Google Cloud Platform support will be enabled for TensorFlow
Do you wish to build TensorFlow with Hadoop File System support? [y/N] n
No Hadoop File System support will be enabled for TensorFlow
Do you wish to build TensorFlow with the XLA just-in-time compiler (experimental)? [y/N] n
No XLA JIT support will be enabled for TensorFlow
Found possible Python library paths:
  /usr/local/lib/python3.5/dist-packages
  /usr/lib/python3/dist-packages
Please input the desired Python library path to use.  Default is [/usr/local/lib/python3.5/dist-packages]

Using python library path: /usr/local/lib/python3.5/dist-packages
Do you wish to build TensorFlow with OpenCL support? [y/N] n
No OpenCL support will be enabled for TensorFlow
Do you wish to build TensorFlow with CUDA support? [y/N] n
No CUDA support will be enabled for TensorFlow
Configuration finished
.................................................................................................................................................................................
INFO: Starting clean (this may take a while). Consider using --expunge_async if the clean takes more than several minutes.
..........................................................................................................................................................................
ERROR: package contains errors: tensorflow/compiler/tf2xla/kernels.
ERROR: error loading package 'tensorflow/compiler/tf2xla/kernels': Encountered error while reading extension file 'protobuf.bzl': no such package '@protobuf//': Error downloading [http://bazel-mirror.storage.googleapis.com/github.com/google/protobuf/archive/9d3288e651700f3d52e6b4ead2a9f9ab02da53f4.tar.gz, https://github.com/google/protobuf/archive/9d3288e651700f3d52e6b4ead2a9f9ab02da53f4.tar.gz] to /home/yufeiizhang/.cache/bazel/_bazel_yufeiizhang/4e73af2737d6f680f5f982db4f6d678d/external/protobuf/9d3288e651700f3d52e6b4ead2a9f9ab02da53f4.tar.gz: All mirrors are down: [GET returned 404 Not Found, java.lang.IllegalStateException].
yufeiizhang@RaspberryPI-YFZhang:~/tf/tensorflow$

brianlan commented 7 years ago

Updating certificates didn't solve my problem, but upgrading bazel to 0.4.4 did. Thank you guys.

brookwc commented 7 years ago

@k-schreiber, did you resolve your issue above:

FAILED_PRECONDITION, details="Default serving signature key not found."

I am running into the same issue and wondering if you have resolved it. Thanks.

kirilg commented 7 years ago

@brookwc are you running at head? I would recommend you sync TensorFlow Serving to head (along with its submodules) and follow the instructions in https://tensorflow.github.io/serving/serving_inception to export the model. Note that we recently switched the export to SavedModel from the old SessionBundle format, so you'll have to delete the old model if you exported before that. Using the latest code, the right signatures should be defined.
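
If it helps to verify a re-export, here is a small sketch (assuming the TF 1.x loader API; the export path is a placeholder) that loads the SavedModel and prints its signature keys. A correct export should list 'serving_default', the default serving signature key the client error above complains about.

import tensorflow as tf

export_dir = '/tmp/inception_export/1'  # placeholder: path to one exported model version

with tf.Session(graph=tf.Graph()) as sess:
    # Load the SavedModel with the 'serve' tag and inspect its signature map.
    meta_graph_def = tf.saved_model.loader.load(
        sess, [tf.saved_model.tag_constants.SERVING], export_dir)
    print(sorted(meta_graph_def.signature_def.keys()))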

brookwc commented 7 years ago

@kirilg you're right, syncing up with the latest head solved the issue. Thanks!