Open dan-zheng opened 3 years ago
Thank you very much, Dan! I just tried compiling this with the latest Swift nightly, and this (https://gist.github.com/ProfFan/638f61aff223bfcbea94b2ddb026497a) is what I got. There is one compiler crash, and a lot of errors about `ElementaryFunctions` not existing.
I have got past the `ElementaryFunctions` issue with `swift build -Xswiftc -DTENSORFLOW_USE_STANDARD_TOOLCHAIN -Xcc -I/usr/include/tensorflow`. Now the problem is that libx10 cannot be found:
swift build -Xswiftc -DTENSORFLOW_USE_STANDARD_TOOLCHAIN -Xcc -I/usr/include/tensorflow
/usr/bin/ld.gold: error: cannot find -lx10
[same error repeated 77 times in total]
clang-10: error: linker command failed with exit code 1 (use -v to see invocation)
<unknown>:0: error: link command failed with exit code 1 (use -v to see invocation)
[0/16] Linking libTensorFlow.so
Actually, there is more to it: it appears that `_NumericShims` is built but not linked, because SwiftPM ended the compilation prematurely. I have no idea what is happening.
OK, I see the problem. X10 needs to be built separately, but can X10 be built against an existing TensorFlow install, or does it require the TensorFlow source? Can the two coexist? @BradLarson Could you help me debug this? Thanks a lot!
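The `cannot find -lx10` errors above mean the linker searched its library paths and found no X10 library. A minimal sketch of a check for that, assuming the library is named `libx10.so`, `libx10.a`, or `libx10.dylib` (the names `-lx10` would normally resolve to); the helper name `has_x10` and the candidate directories are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: succeed if DIR contains a library that could satisfy -lx10.
# The file names are assumptions based on how -l<name> is normally resolved.
has_x10() {
    dir="$1"
    [ -f "$dir/libx10.so" ] || [ -f "$dir/libx10.a" ] || [ -f "$dir/libx10.dylib" ]
}

# Report which candidate directories (illustrative only) contain the library.
for candidate in /usr/lib /usr/local/lib; do
    if has_x10 "$candidate"; then
        echo "found libx10 in $candidate"
    else
        echo "no libx10 in $candidate"
    fi
done
```

If none of the directories contains the library, it has to be built or installed first, and its location passed to the linker.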
Hi Fan,
Did you follow "build instructions" above and install pre-built X10 libraries? I believe they're currently available only for macOS and Windows – not Linux unfortunately.
The instructions for "building libraries depending on `tensorflow/swift-apis`" come from this documentation. An alternative to using pre-built X10 libraries is to build them yourself, which should work just fine on Linux using swift.org/download toolchains.
Let me know if you need any help! I'm happy to video call if you'd like.
Hi Dan,
Thanks for the instructions! I have checked the build instructions and wonder whether X10 can be built against a system-packaged TensorFlow with headers. I think this is a very important question: if X10 can be built separately, there is a much higher chance that it will survive TF updates.
Sure thing! I believe @compnerd can provide a more accurate answer to your question about system-packaged TensorFlow and X10. I recall discussing such things before: using a system package manager seems more heavyweight and platform-specific, but maybe it's more robust against breakages, as you suggest.
@ProfFan - When building a Swift for TensorFlow toolchain from scratch, X10 and TensorFlow are built from a specified TensorFlow version, and you have to manually move that version up to build against a new version of TensorFlow. In the worst case, you can still build these libraries as part of building a stock toolchain + swift-apis from scratch.
I don't know if these steps are documented anywhere, so I'll write down the sequence of commands needed to create a new toolchain based on the stock Swift compiler from scratch:
export TF_NEED_CUDA=1
mkdir swift-source
cd swift-source
git clone https://github.com/apple/swift.git
./swift/utils/update-checkout --clone --skip-repo swift
./swift/utils/build-toolchain buildbot_linux
git clone https://github.com/tensorflow/swift-apis.git
cmake -B BinaryCache -D BUILD_SHARED_LIBS=YES -D CMAKE_BUILD_TYPE=Release -D CMAKE_INSTALL_PREFIX=/media/nvidia/Data/Development/swift-source/swift-nightly-install/usr -D CMAKE_Swift_COMPILER=/media/nvidia/Data/Development/swift-source/swift-nightly-install/usr/bin/swiftc -D TENSORFLOW_USE_STANDARD_TOOLCHAIN=YES -G Ninja -S ./swift-apis
cmake --build BinaryCache --target install
tar -czf swift-tensorflow-stock-Jetson.tar.gz -C swift-nightly-install/ usr
(you may need to alter a few of the hardcoded paths above, this was a quick copy-paste)
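Since the configure step above hard-codes an install prefix, one way to adapt it is to derive both paths from a single variable. This is only a sketch: the helper name `swift_apis_configure_command` is hypothetical, it prints the command rather than running it, and it assumes the prefix contains no spaces and holds `bin/swiftc`, as in the sequence above.

```shell
#!/bin/sh
# Hypothetical helper: print the swift-apis configure command for a given
# toolchain install prefix (the directory that contains bin/swiftc).
# The cmake options are copied from the sequence above; only the paths vary.
swift_apis_configure_command() {
    prefix="$1"
    printf 'cmake -B BinaryCache -D BUILD_SHARED_LIBS=YES -D CMAKE_BUILD_TYPE=Release -D CMAKE_INSTALL_PREFIX=%s -D CMAKE_Swift_COMPILER=%s/bin/swiftc -D TENSORFLOW_USE_STANDARD_TOOLCHAIN=YES -G Ninja -S ./swift-apis\n' "$prefix" "$prefix"
}

# Example: a toolchain installed under the current swift-source directory.
swift_apis_configure_command "$PWD/swift-nightly-install/usr"
```

Printing the command first makes it easy to inspect the substituted paths before running it.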
For a Jetson build, you also need to add the following at the beginning to specify CUDA architectures:
export TF_CUDA_COMPUTE_CAPABILITIES=compute_53,compute_62,compute_72
In the process of building this, all headers and binaries are generated for X10 and TensorFlow. I can extract and package these for Ubuntu, based on our 0.13 toolchains. That should contain everything you'd need to build swift-apis as a package, and would serve as long as you didn't need to advance beyond TensorFlow 2.4.0. Would that be useful to have? If so, which Ubuntu configurations would be most useful to focus on?
OK, I tried it out and I think my idea of extracting the binary libraries from the completed toolchains will work. This is a version of the X10 standalone libraries (with TensorFlow headers) that builds on Ubuntu 18.04, CPU-only, with Dan's setup here. You might need to find the right Swift toolchain to use, however, because the `zeroTangentVector` changes upstream look like they might cause problems here.
If you want me to, I can create X10 snapshots from all of our Ubuntu variants and add them to the Windows and macOS snapshots linked on our development page.
@BradLarson Thanks a lot, Brad! One last question: is it possible to build X10 with only the TF headers from a vendor install of TF? For example, Arch Linux ships TF with full headers as a prebuilt package. In my experiments, the X10 CMake build always seems to clone the full source tree from GitHub.
But you are right, since Swift lives in a prefix we can definitely ship the TF libraries with the toolchain (separate from system TF) as well.
@ProfFan - I don't believe that libx10 can be built without access to the TensorFlow source, due to its need to compile in elements of XLA. Not entirely sure if the same is true for our eager-mode access, but I believe we build that in, too. Our toolchains exist independently of the system-installed TensorFlow, as does a binary library package like the one I linked above, and don't make use of it if it is available. Our TensorFlow support is pretty much standalone.
@BradLarson Thanks for the explanation! That is totally good :)
I've created both CUDA 11 and CPU-only Ubuntu 18.04 X10 packages and linked them here: https://github.com/tensorflow/swift-apis/pull/1182 . I figured those would be the two most popular platforms for people carrying this on in the near term, but can add others if needed.
Motivation
This enables building SwiftFusion using stock toolchains from swift.org/download. `swift build` will clone and build `tensorflow/swift-apis` as a regular SwiftPM dependency. Eventually, we would like to stop releasing custom toolchains bundled with pre-installed `tensorflow/swift-apis`.

Build instructions
It is possible to build `tensorflow/swift-apis` and dependencies like SwiftFusion using stock toolchains by installing pre-built X10 libraries (currently available only for macOS and Windows).

After installing (e.g. to `$HOME/Library` on macOS), build with SwiftPM, passing the X10 include and library paths (see the invocations under Testing).

`swift test` is known not to work on macOS for `tensorflow/swift-apis` and dependencies due to SR-14008: `Library not loaded: /usr/lib/swift/libswift_Differentiation.dylib`.

Testing
Before merging, let's verify that `swift build`, `swift run`, and `swift test` work for swift.org/download toolchains across platforms, and update GitHub Actions CI so that it passes.

`swift run` and `swift test` currently both fail due to SR-14008:

```console
$ swift run Pose3SLAMG2O -Xcc -I$HOME/Library/tensorflow-2.4.0/usr/include -Xlinker -L$HOME/Library/tensorflow-2.4.0/usr/lib -Xswiftc -DTENSORFLOW_USE_STANDARD_TOOLCHAIN
...
Everything is already up-to-date
dyld: Library not loaded: /usr/lib/swift/libswift_Differentiation.dylib
  Referenced from: /Users/danielzheng/SwiftFusion/.build/x86_64-apple-macosx/debug/Pose3SLAMG2O
  Reason: image not found
[1]    79788 abort      swift run Pose3SLAMG2O -Xcc -I$HOME/Library/tensorflow-2.4.0/usr/include
```

```console
$ swift test -Xcc -I$HOME/Library/tensorflow-2.4.0/usr/include -Xlinker -L$HOME/Library/tensorflow-2.4.0/usr/lib -Xswiftc -DTENSORFLOW_USE_STANDARD_TOOLCHAIN
...
Everything is already up-to-date
2021-01-08 07:14:48.425 xctest[79757:2116295] The bundle “SwiftFusionPackageTests.xctest” couldn’t be loaded because it is damaged or missing necessary resources. Try reinstalling the bundle.
2021-01-08 07:14:48.425 xctest[79757:2116295] (dlopen_preflight(/Users/danielzheng/SwiftFusion/.build/x86_64-apple-macosx/debug/SwiftFusionPackageTests.xctest/Contents/MacOS/SwiftFusionPackageTests): Library not loaded: /usr/lib/swift/libswift_Differentiation.dylib
  Referenced from: /Users/danielzheng/SwiftFusion/.build/x86_64-apple-macosx/debug/SwiftFusionPackageTests.xctest/Contents/MacOS/SwiftFusionPackageTests
  Reason: image not found)
```
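The `swift run` and `swift test` invocations above share the same three flag groups, so the build step presumably takes them too. The sketch below assembles such a command for any SwiftPM subcommand. It is an illustration rather than the PR's documented command: the helper name `swiftpm_x10_command` is hypothetical, and it assumes the X10 package was unpacked so that `usr/include` and `usr/lib` sit under one prefix, as in the paths above.

```shell
#!/bin/sh
# Hypothetical helper: print the SwiftPM command for SUBCOMMAND (build, test,
# "run <target>") against pre-built X10 libraries unpacked under X10_PREFIX.
# Flags are taken from the swift run / swift test invocations above.
swiftpm_x10_command() {
    subcommand="$1"
    x10_prefix="$2"
    printf 'swift %s -Xcc -I%s/usr/include -Xlinker -L%s/usr/lib -Xswiftc -DTENSORFLOW_USE_STANDARD_TOOLCHAIN\n' "$subcommand" "$x10_prefix" "$x10_prefix"
}

# Example: the macOS install location used in the logs above.
swiftpm_x10_command build "$HOME/Library/tensorflow-2.4.0"
```

The helper only prints the command, so the result can be inspected before it is run.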