No, JetPack does not support running directly on the L4T platform.
I meant whether you have flashed the board with JetPack 2 to get CUDA 7 support.
Ah, yes, I have CUDA 7 support and used JetPack 2. To be more precise, the target is not actually a Jetson TX1 but a repurposed Nvidia Shield TV flashed to L4T 23.1 for Jetson.
@Yangqing FYI
I think there is a TX1 that I could use to take a look. I'll see what I can do.
In theory, can TensorFlow run usefully on the TK1? Or is the 2 GB of memory too small for, say, face verification?
@robagar It all depends on how large your network is and whether you intend to train the model on TK1 or just run inference. Two GB of memory is plenty to run inference on almost any model.
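For a rough sense of scale, AlexNet has about 61 million parameters, so its fp32 weights take roughly 61M × 4 bytes ≈ 244 MB, and single-image inference adds comparatively small activation buffers on top of that, which fits comfortably in 2 GB. Training also has to keep gradients, optimizer state, and per-layer activations for the backward pass, which can multiply that footprint several times over.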
I have worked around an issue that prevented nvcc from compiling the Eigen codebase on Tegra X1 (https://bitbucket.org/eigen/eigen/commits/d0950ac79c0404047379eb5a927a176dbb9d12a5). However, so far I haven't succeeded in setting up bazel on the Tegra X1, so I haven't been able to start working on the other issues reported in http://cudamusing.blogspot.de/2015/11/building-tensorflow-for-jetson-tk1.html
That's good news ;) What's the problem with bazel? maxcuda's instructions for building bazel worked quite well for me.
For building bazel I had to use a special Java build which can cope with the 32-bit rootfs on a 64-bit machine:
wget http://www.java.net/download/jdk8u76/archive/b02/binaries/jdk-8u76-ea-bin-b02-linux-arm-vfp-hflt-04_jan_2016.tar.gz
sudo tar -zxvf jdk-8u76-ea-bin-b02-linux-arm-vfp-hflt-04_jan_2016.tar.gz -C /usr/lib/jvm
sudo update-alternatives --install "/usr/bin/java" "java" "/usr/lib/jvm/jdk1.8.0_76/bin/java" 1
sudo update-alternatives --config java
There seems to be one Eigen issue I can't get around:
bazel build -c opt --local_resources 2048,0.5,1.0 --verbose_failures --config=cuda //tensorflow/cc:tutorials_example_trainer
WARNING: Sandboxed execution is not supported on your system and thus hermeticity of actions cannot be guaranteed. See http://bazel.io/docs/bazel-user-manual.html#sandboxing for more information. You can turn off this warning via --ignore_unsupported_sandboxing.
INFO: Found 1 target...
INFO: From Compiling tensorflow/core/kernels/cross_op_gpu.cu.cc:
At end of source: warning: routine is both "inline" and "noinline"
external/eigen_archive/eigen-eigen-c5e90d9e764e/unsupported/Eigen/CXX11/src/Tensor/TensorEvaluator.h(125): warning: routine is both "inline" and "noinline"
At end of source: warning: routine is both "inline" and "noinline"
external/eigen_archive/eigen-eigen-c5e90d9e764e/unsupported/Eigen/CXX11/src/Tensor/TensorEvaluator.h(125): warning: routine is both "inline" and "noinline"
./tensorflow/core/lib/strings/strcat.h(195): internal error: assertion failed at: "/dvs/p4/build/sw/rel/gpu_drv/r346/r346_00/drivers/compiler/edg/EDG_4.9/src/decl_inits.c", line 3251
1 catastrophic error detected in the compilation of "/tmp/tmpxft_0000682d_00000000-8_cross_op_gpu.cu.cpp4.ii".
Compilation aborted.
Aborted
ERROR: /opt/tensorflow/tensorflow/core/BUILD:331:1: output 'tensorflow/core/_objs/gpu_kernels/tensorflow/core/kernels/cross_op_gpu.cu.o' was not created.
ERROR: /opt/tensorflow/tensorflow/core/BUILD:331:1: not all outputs were created.
Target //tensorflow/cc:tutorials_example_trainer failed to build
INFO: Elapsed time: 2271.358s, Critical Path: 2260.25s
Can you have a look at TensorEvaluator.h please?
I still haven't been able to install bazel. That said, the assertion you're facing seems to be triggered by the variadic template at line 195 of ./tensorflow/core/lib/strings/strcat.h. I would just comment out this code and see how it goes.
When you say maxcuda has "been unable to repeatedly build it" since then, does that mean that tensorflow is not working on the TK1 anymore? Because I just ordered the TK1 with the express purpose of being able to run tensorflow :-/
Yes, I have been unable to recompile the latest versions. The wheel I built around Thanksgiving should still work but it is quite an old version.
Commenting out the variadic template at line 195 helps a little, but at line 234 there is another template that seems to be required. Any hints on how to rewrite that in an nvcc-friendly manner?
@benoitsteiner any suggestions how this could be rewritten in a nvcc compatible manner?
// Support 5 or more arguments
template <typename... AV>
inline void StrAppend(string *dest, const AlphaNum &a, const AlphaNum &b,
                      const AlphaNum &c, const AlphaNum &d, const AlphaNum &e,
                      const AV &... args) {
  internal::AppendPieces(dest,
                         {a.Piece(), b.Piece(), c.Piece(), d.Piece(), e.Piece(),
                          static_cast<const AlphaNum &>(args).Piece()...});
}
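For illustration, one nvcc-friendly rewrite is to drop the parameter pack and spell out fixed-arity overloads, one per argument count actually used. A minimal sketch, assuming the AlphaNum and internal::AppendPieces interfaces from strcat.h stay as they are:

// Sketch only: fixed-arity replacements for the variadic StrAppend,
// assuming the surrounding declarations in
// tensorflow/core/lib/strings/strcat.h are unchanged.
inline void StrAppend(string *dest, const AlphaNum &a, const AlphaNum &b,
                      const AlphaNum &c, const AlphaNum &d, const AlphaNum &e,
                      const AlphaNum &f) {
  internal::AppendPieces(
      dest, {a.Piece(), b.Piece(), c.Piece(), d.Piece(), e.Piece(), f.Piece()});
}

// Repeat the same pattern for 7, 8, ... arguments up to the largest
// arity the tree actually uses.
inline void StrAppend(string *dest, const AlphaNum &a, const AlphaNum &b,
                      const AlphaNum &c, const AlphaNum &d, const AlphaNum &e,
                      const AlphaNum &f, const AlphaNum &g) {
  internal::AppendPieces(dest, {a.Piece(), b.Piece(), c.Piece(), d.Piece(),
                                e.Piece(), f.Piece(), g.Piece()});
}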
@damienmg FYI
Hi folks, I'm also working on building everything from scratch on the TX1. There are lots of discussions here and on the NVIDIA developer forums, but so far I haven't seen any well-summarized instructions besides the TK1 ones. Can we start another repo or script file so people can work on this more efficiently?
Imho we first have to solve the fundamental issue of variadic templates not working with nvcc. Either the developers do without those templates, which is backwards and probably not going to happen, or nvidia has to step up and make nvcc more compatible. In theory nvcc should already be able to handle your own variadic templates, but external headers, e.g. from the STL, won't "just work" because every function called on the device needs a __host__ __device__ annotation. Maybe someone knows a good way to get around this issue...
@jmtatsch At the moment, the version of cuda that is shipped with the Tegra X1 has problems with variadic templates. Nvidia is aware of this and working on a fix. I updated Eigen a few weeks ago to disable the use of variadic templates when compiling on Tegra X1, and that seems to fix the bulk of the problem. However, StrCat and StrAppend still rely on variadic templates. Until nvidia releases a fix, the best solution is to comment out the variadic versions of StrCat and StrAppend and create non-variadic versions with up to 11 arguments (since that's what TensorFlow currently needs).

There are a couple of ways to avoid the STL issues: a brittle solution is to only compile optimized kernels. The compiler then inlines the STL code, at which point the lack of a __host__ __device__ annotation doesn't matter since there is no function call left to resolve. A better solution is to replace all the STL functionality with custom code. We've started to do this in Eigen by reimplementing most of the STL functions we need in the Eigen::numext namespace. This is tedious but much more reliable than relying on inlining to bypass the problem.
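To make the numext idea concrete, here is a sketch of my own (not code from Eigen) of an annotated replacement for std::min: because std::min carries no __host__ __device__ annotation, calling it from device code only compiles if the optimizer happens to inline it, whereas an annotated reimplementation removes that gamble.

// Sketch only, not Eigen source: a numext-style replacement for std::min.
namespace numext_sketch {
template <typename T>
__host__ __device__ inline const T& mini(const T& a, const T& b) {
  return b < a ? b : a;  // callable from both host and device code
}
}  // namespace numext_sketch

// Hypothetical kernel using the annotated helper instead of std::min.
__global__ void clamp_kernel(float* data, int n, float hi) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) data[i] = numext_sketch::mini(data[i], hi);
}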
I have a build of TF 0.8, but it requires a new 7.0 compiler that is not yet available to the general public. I am building a wheel on a Jetson TK1 and will make it available after some testing. I will update the instructions on how to build from source on cudamusing.
Good work @maxcuda! Will it build on the TX1 too?
Yes, it will build on the TX1 too. I fixed a problem with the new memory allocator to take into account the 32-bit OS. Some basic tests are passing, but the label_image test is giving wrong results, so there may be other places with 32-bit issues.
@benoitsteiner, with the new compiler your change to Eigen is not required anymore (and it forces editing a bunch of files). Could you please remove the check and re-enable variadic templates?
@maxcuda Where can I download the new cuda compiler? I'd like to make sure that I don't introduce new problems when I enable variadic templates again.
@maxcuda is the new 7.0 compiler you were referencing part of JetPack 2.2 that was just released?
Yes, you can get it with: wget http://developer.download.nvidia.com/embedded/L4T/r24_Release_v1.0/CUDA/cuda-repo-l4t-7-0-local_7.0-76_armhf.deb
The good news is that I was able to build v0.8, but some of the results are incorrect. I will update the blog with the changes. With v0.9 I had a problem with the cudnn.cc file; it looks like it cannot handle cudnn v2.
Thanks so much. Looking forward to your post so I can get tensorflow running on the TX1
I updated my build instructions on cudamusing and also posted a wheel file.
Has anyone tested this on the Jetson TX1? I can't seem to get bazel to build on aarch64.
@syed-ahmed I tested it on the TX1. This is my configuration.
@syed-ahmed I got it to build on an aarch64 TX1. I mostly followed the instructions for the TK1 at cudamusing.blogspot.de. The only additional things I did were:
- add aarch64 to the ARM enum in /bazel/src/main/java/com/google/devtools/build/lib/util/CPU.java by changing line 28 to "ARM("arm", ImmutableSet.of("arm", "armv7l", "aarch64"))," without quotes
- add aarch64 as a valid ARM machine type in /bazel/scripts/bootstrap/buildenv.sh by changing line 35 to "if [ "${MACHINE_TYPE}" = 'arm' -o "${MACHINE_TYPE}" = 'armv7l' -o "${MACHINE_TYPE}" = 'aarch64' ]; then" without quotes
Or, if you prefer, here is the bazel executable for aarch64 I ended up with: https://drive.google.com/file/d/0B8Gc_oVaYC7CWEhOMHJhc0hLY0U/view?usp=sharing
Maybe make a PR against bazel?
@tylerfox Thank you! I'll try your suggestions. In the meantime, any thoughts on this: https://github.com/bazelbuild/bazel/issues/1264 and @wtfuzz's change to cc_configure.bzl? I was getting a toolchain error, so I'm wondering if you encountered it.
Did you build with the latest bazel release or with 0.1.4? And what about the tensorflow version, r0.8?
@syed-ahmed yes, changing buildenv.sh should fix that issue. It's also worth noting that I used bazel 0.1.4 per the instructions on cudamusing. I should probably test on the current version of bazel as well, but for now I know 0.1.4 works.
I am trying to build the tensorflow r0.9 release. I got bazel 0.2.1 installed following @tylerfox's suggestions, but I get the following error when trying to build tensorflow. Any thoughts? I appreciate all the help.
>>>>> # @farmhash_archive//:configure [action 'Executing genrule @farmhash_archive//:configure [for host]']
(cd /home/ubuntu/.cache/bazel/_bazel_ubuntu/ad1e09741bb4109fbc70ef8216b59ee2/tensorflow && \
exec env - \
PATH=/usr/local/cuda-7.0/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/home/ubuntu/bazel/output/ \
/bin/bash -c 'source external/bazel_tools/tools/genrule/genrule-setup.sh; pushd external/farmhash_archive/farmhash-34c13ddfab0e35422f4c3979f360635a8c050260; workdir=$(mktemp -d -t tmp.XXXXXXXXXX); cp -a * $workdir; pushd $workdir; ./configure; popd; popd; cp $workdir/config.h bazel-out/host/genfiles/external/farmhash_archive/farmhash-34c13ddfab0e35422f4c3979f360635a8c050260; rm -rf $workdir;')
ERROR: /home/ubuntu/.cache/bazel/_bazel_ubuntu/ad1e09741bb4109fbc70ef8216b59ee2/external/farmhash_archive/BUILD:5:1: Executing genrule @farmhash_archive//:configure failed: bash failed: error executing command
(cd /home/ubuntu/.cache/bazel/_bazel_ubuntu/ad1e09741bb4109fbc70ef8216b59ee2/tensorflow && \
exec env - \
PATH=/usr/local/cuda-7.0/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/home/ubuntu/bazel/output/ \
/bin/bash -c 'source external/bazel_tools/tools/genrule/genrule-setup.sh; pushd external/farmhash_archive/farmhash-34c13ddfab0e35422f4c3979f360635a8c050260; workdir=$(mktemp -d -t tmp.XXXXXXXXXX); cp -a * $workdir; pushd $workdir; ./configure; popd; popd; cp $workdir/config.h bazel-out/host/genfiles/external/farmhash_archive/farmhash-34c13ddfab0e35422f4c3979f360635a8c050260; rm -rf $workdir;'): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 1.
/home/ubuntu/.cache/bazel/_bazel_ubuntu/ad1e09741bb4109fbc70ef8216b59ee2/tensorflow/external/farmhash_archive/farmhash-34c13ddfab0e35422f4c3979f360635a8c050260 /home/ubuntu/.cache/bazel/_bazel_ubuntu/ad1e09741bb4109fbc70ef8216b59ee2/tensorflow
/tmp/tmp.ZKGtjQ4mLO /home/ubuntu/.cache/bazel/_bazel_ubuntu/ad1e09741bb4109fbc70ef8216b59ee2/tensorflow/external/farmhash_archive/farmhash-34c13ddfab0e35422f4c3979f360635a8c050260 /home/ubuntu/.cache/bazel/_bazel_ubuntu/ad1e09741bb4109fbc70ef8216b59ee2/tensorflow
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /bin/mkdir -p
checking for gawk... no
checking for mawk... mawk
checking whether make sets $(MAKE)... yes
checking whether make supports nested variables... yes
checking build system type... /tmp/tmp.ZKGtjQ4mLO/missing: Unknown `--is-lightweight' option
Try `/tmp/tmp.ZKGtjQ4mLO/missing --help' for more information
configure: WARNING: 'missing' script is too old or missing
./config.guess: unable to guess system type
This script, last modified 2010-08-21, has failed to recognize
the operating system you are using. It is advised that you
download the most up to date version of the config scripts from
http://git.savannah.gnu.org/gitweb/?p=config.git;a=blob_plain;f=config.guess;hb=HEAD
and
http://git.savannah.gnu.org/gitweb/?p=config.git;a=blob_plain;f=config.sub;hb=HEAD
If the version you run (./config.guess) is already up to date, please
send the following data and any information you think might be
pertinent to <config-patches@gnu.org> in order to provide the needed
information to handle your system.
config.guess timestamp = 2010-08-21
uname -m = aarch64
uname -r = 3.10.96-tegra
uname -s = Linux
uname -v = #1 SMP PREEMPT Tue May 17 16:29:05 PDT 2016
/usr/bin/uname -p =
/bin/uname -X =
hostinfo =
/bin/universe =
/usr/bin/arch -k =
/bin/arch =
/usr/bin/oslevel =
/usr/convex/getsysinfo =
UNAME_MACHINE = aarch64
UNAME_RELEASE = 3.10.96-tegra
UNAME_SYSTEM = Linux
UNAME_VERSION = #1 SMP PREEMPT Tue May 17 16:29:05 PDT 2016
configure: error: cannot guess build type; you must specify one
Does anyone know what farmhash is used for in tensorflow r0.9? My motivation for installing tensorflow 0.9 on the Jetson TX1 is solely to use some of the fp16 ops. Hence, if farmhash is not doing anything important, maybe I could remove the farmhash-related code and build without it. Here is the farmhash commit.
Temporary sources used in the build process live under ~/.cache/bazel. cd into that directory and search for config.guess: find ./ -name "config.guess".
You may get several hits, but the paths should make clear which config.guess belongs to farmhash. In my case it is ./_bazel_ubuntu/ad1e09741bb4109fbc70ef8216b59ee2/external/farmhash_archive/farmhash-34c13ddfab0e35422f4c3979f360635a8c050260/config.guess
In this file, replace the line
UNAME_MACHINE=`(uname -m) 2>/dev/null` || UNAME_MACHINE=unknown
with
UNAME_MACHINE=armhf
On my machine (Nvidia Shield TV flashed to L4T 23.1) farmhash built successfully after this change.
I successfully built tensorflow on the TX1 (L4T 24.1, 64-bit) with the following patch, but running the example failed with the following kernel message:
tutorials_examp[31026]: unhandled level 1 translation fault (11) at 0xffffffffffffe8, esr 0x92000005
Maybe building farmhash with --build=arm-linux-gnu in farmhash.BUILD is wrong? But I failed to compile it with --build=aarch64-linux-gnu. I'm still trying to figure out what causes the runtime failure.
tx1_patch.zip
@benoitsteiner has re-enabling variadic templates been verified to work?
@shingchuang have you found the root cause of the segmentation fault issue? I have the same problem on an aarch64 platform.
I tried to re-enable variadic templates last night after upgrading the cuda compiler using http://developer.download.nvidia.com/embedded/L4T/r24_Release_v1.0/CUDA/cuda-repo-l4t-7-0-local_7.0-76_armhf.deb. This new compiler appears to fix some of the issues, but I still get some crashes.
I noticed that nvidia released an even more recent version of the compiler. @maxcuda, is there a debian package that I can use to install the latest version of the cuda sdk?
Re-install / re-flash using JetPack 2.3 because the latest release also updated to Ubuntu 16.04 aarch64 in addition to CUDA 8 and L4T R24.2. The underlying CUDA version is tied to the L4T BSP in JetPack.
Hi all. I'm trying to build TensorFlow for the Google Pixel C in order to use its TX1 GPU. Do you build it on your machine (e.g. a Mac) or on the device itself (e.g. the Pixel C)? Does anyone have the already-generated files for the TX1, or can anyone point me in the right direction? Thanks.
Hi all - I haven't gotten TensorFlow r0.11 working yet, but I do have a working path to a TensorFlow r0.9 install on the TX1 with JetPack 2.3. I have tested basic MLP/LSTM/conv nets and they seem to work, though it OOMs pretty easily on bigger convolutions.
I wrote down all my steps and patches below in case it's helpful to anyone. I really appreciated all the above commentary; it was critical to tracking down the right path.
http://stackoverflow.com/questions/39783919/tensorflow-on-nvidia-tx1/
@dwightcrow, I tried your solution and it works on the TX1, thank you. Version 0.11.0rc0 can be built with bazel 0.3.2.
That's fantastic. Does bazel 0.3.2 build fairly easily on the TX1?
Wondering if there's a concise summary of everything in this issue? It would definitely make it easier for others trying to get TF working on a TX1.
Following up on the request for a summary to build tensorflow on a Jetson TX1. Any help is appreciated.
Hello,
@maxcuda recently got tensorflow running on the TK1, as documented in the blog post http://cudamusing.blogspot.de/2015/11/building-tensorflow-for-jetson-tk1.html, but has since been unable to build it repeatedly. I am now trying to get tensorflow running on a TX1 Tegra platform and need some support.
Much trouble seems to come from Eigen variadic templates and C++11 initializer lists, both of which should work according to http://devblogs.nvidia.com/parallelforall/cplusplus-11-in-cuda-variadic-templates/. In theory, -std=c++11 should be set according to the crosstool file. Nevertheless, nvcc happily crashes on all of them. This smells as if the "-std=c++11" flag is not properly set. How can I verify/enforce this?
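One way to check this outside of bazel (a quick probe I would try, assuming nvcc is on the PATH) is to compile a minimal variadic-template file directly, with and without the flag:

// variadic_probe.cu: if this compiles with
//   nvcc -std=c++11 variadic_probe.cu -o variadic_probe
// but fails without -std=c++11, the flag matters and is worth tracing
// through the bazel crosstool; if it crashes nvcc either way, the
// problem is the compiler rather than the flag.
#include <cstdio>

template <typename T>
T sum(T v) { return v; }

template <typename T, typename... Rest>
T sum(T first, Rest... rest) { return first + sum(rest...); }

int main() {
  std::printf("sum = %d\n", sum(1, 2, 3, 4, 5));  // expect 15
  return 0;
}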
Also, in tensorflow.bzl, variadic templates in Eigen are said to be disabled:
// We have to disable variadic templates in Eigen for NVCC even though std=c++11 are enabled
Is that still necessary? Here is my build workflow: