Closed: ewirbel closed this issue 2 years ago
TF Serving does not have support for the aarch64
architecture (in the BUILD system, and I suspect the code might need changes too).
TF (core), as far as I know, has support for aarch64.
For the lite ecosystem, see:
https://www.tensorflow.org/lite/guide/build_arm64
Happy to accept patches to add aarch64
support to TF code base.
There is also TF SIG Build; to see what others are doing regarding aarch64 builds, try asking on their mailing list.
TensorFlow Serving builds quite nicely on Jetson devices nowadays - have a look at https://github.com/helmuthva/jetson/tree/master/workflow/deploy/tensorflow-serving-base/src or https://github.com/helmuthva/jetson for the bigger picture of this project.
Docker images to get TensorFlow Serving up and running on Jetson Nano and Jetson AGX Xavier devices are now published on DockerHub - see https://hub.docker.com/u/helmuthva
To allow GPU access from inside the container the following devices have to be mounted when running the container:
- /dev/nvhost-ctrl
- /dev/nvhost-ctrl-gpu
- /dev/nvhost-prof-gpu
- /dev/nvmap
- /dev/nvhost-gpu
- /dev/nvhost-as-gpu
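A sketch of what that invocation can look like. The image name below is an assumption (substitute the actual tag from the DockerHub account above); the command is printed rather than executed, since docker and the device nodes only exist on the Jetson itself:

```shell
# Assumed image name -- replace with the real tag from hub.docker.com/u/helmuthva
IMAGE="helmuthva/jetson-nano-tensorflow-serving"

# GPU device nodes that must be passed through to the container
DEVICES="/dev/nvhost-ctrl /dev/nvhost-ctrl-gpu /dev/nvhost-prof-gpu /dev/nvmap /dev/nvhost-gpu /dev/nvhost-as-gpu"

# Build one --device flag per node
DEVICE_FLAGS=""
for d in $DEVICES; do
  DEVICE_FLAGS="$DEVICE_FLAGS --device $d"
done

# 8500 is TF Serving's default gRPC port, 8501 the REST port
echo docker run --runtime nvidia $DEVICE_FLAGS -p 8500:8500 -p 8501:8501 "$IMAGE"
```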
Hi! I want to use TensorFlow Serving on my Jetson TX2. I've successfully pulled the docker image and tried to create a new container with all these devices mounted. When the container starts, RAM usage exceeds 90% and I get out-of-memory messages in the container logs. The query execution time fluctuates around 5 seconds, which is too slow for me. When using TensorFlow Serving on a weaker computer without a GPU, I get a runtime of about 0.5-1 second. What am I doing wrong? Please help me.
Did you figure this out, @deaffella ? I'm planning to do that if I can, to make it simpler to serve a model on my TX2. Basically, my issue with the images Helmut refers to above is that they are too large for my device (which already has some things on it, and the images use 6+ GB). Trying to build with bazel, I get this output:
RUN bazel build --color=yes --curses=yes --jobs="${JOBS}" --verbose_failures --output_filter=DONT_MATCH_ANYTHING --config=cuda --config=nativeopt --config=jetson --copt="-fPIC" tensorflow_serving/model_servers:tensorflow_model_server && cp /tensorflow-serving/bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server /usr/local/bin/tensorflow_model_server
---> Running in c271cf5b58c4
Extracting Bazel installation...
Starting local Bazel server and connecting to it...
ERROR: error loading package '': Encountered error while reading extension file 'third_party/toolchains/preconfig/generate/archives.bzl': no such package '@org_tensorflow//third_party/toolchains/preconfig/generate': type 'repository_ctx' has no method patch()
ERROR: error loading package '': Encountered error while reading extension file 'third_party/toolchains/preconfig/generate/archives.bzl': no such package '@org_tensorflow//third_party/toolchains/preconfig/generate': type 'repository_ctx' has no method patch()
INFO: Elapsed time: 33.719s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (0 packages loaded)
The command '/bin/sh -c bazel build --color=yes --curses=yes --jobs="${JOBS}" --verbose_failures --output_filter=DONT_MATCH_ANYTHING --config=cuda --config=nativeopt --config=jetson --copt="-fPIC" tensorflow_serving/model_servers:tensorflow_model_server && cp /tensorflow-serving/bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server /usr/local/bin/tensorflow_model_server' returned a non-zero code: 1
Should I open a new issue ? Not sure what to take a look at, here.
We have the same problem. Any new developments?
@ewirbel,
Can you take a look at this link, which contains a docker image of TF Serving for Jetson Xavier, and let us know if it helps? Thanks!
@ewirbel,
Closing this issue due to lack of recent activity. Please feel free to reopen the issue with more details if the problem still persists. Thanks!
Feature Request
Describe the problem the feature is intended to solve
I am trying to get TF Serving 1.13 with GPU support (server-side API) running on a Jetson AGX Xavier board. I have managed to use the TensorFlow pip wheel provided by NVidia, and then to install the client-side python package, but I need the model server (to run remote inferences on the board).
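For context on the remote-inference part, a minimal sketch of a client request against the model server's REST API (host and model name below are assumptions; TF Serving exposes predictions at /v1/models/&lt;name&gt;:predict on port 8501). The request is printed rather than sent, since no server is running here:

```shell
HOST=jetson.local   # assumed address of the Xavier board
MODEL=mymodel       # assumed model name
BODY='{"instances": [[1.0, 2.0, 3.0]]}'

# Print the curl command instead of executing it
echo curl -s -X POST -d "$BODY" "http://$HOST:8501/v1/models/$MODEL:predict"
```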
Describe the solution
Provide docker images for aarch64 with GPU support, or provide an aarch64 toolchain.
Describe alternatives you've considered
I have unsuccessfully tried to build TensorFlow Serving from source:
- The docker-ce client is not available for aarch64, so I cannot run the docker installation (and I cannot find official docker images for the board from NVidia).
- I tried to replicate what is in the Dockerfile by installing bazel, cloning the serving GitHub repository, and running the same bazel command as the Dockerfile. My bazel version is the following:
bazel version
WARNING: The following rc files are no longer being read, please transfer their contents or import their path into one of the standard rc files: .bazelrc
WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
INFO: Invocation ID: 7aaba226-9820-41a1-90d8-685da07742f5
Build label: 0.20.0- (@non-git)
Build target: bazel-out/aarch64-opt/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Wed Mar 13 14:49:38 2019 (1552488578)
Build timestamp: 1552488578
Build timestamp as int: 1552488578
When running the bazel build command I get the following error
bazel build --verbose_failures -c opt --config=cuda --config=nativeopt --copt="-fPIC" tensorflow_serving/model_servers:tensorflow_model_server
INFO: Invocation ID: 584c76e9-26c5-4440-9927-338e2424fbf8
ERROR: No toolchain found for cpu 'aarch64'. Valid toolchains are: [local_linux: --cpu='local' --compiler='compiler', local_darwin: --cpu='darwin' --compiler='compiler', local_windows: --cpu='x64_windows' --compiler='msvc-cl',]
INFO: Elapsed time: 0.322s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (0 packages loaded)
Additional context
I managed to build TF Serving 1.12 with GPU support using bazel 0.15.2.