Open zachgrayio opened 6 years ago
Thanks for providing so much detail! I'm looking into this now.
I was able to replicate the issue:
$ docker run --rm --privileged --cap-add sys_ptrace -it -v ${PWD}:/usr/src zachgray/swift-tensorflow:4.2 swift -I/usr/lib/swift/clang/include -I/usr/src/TFExample/.build/debug -L/usr/src/TFExample/.build/debug -lTFExample
Welcome to Swift version 4.2-dev (LLVM 04bdb56f3d, Clang b44dbbdf44). Type :help for assistance.
1> import TensorFlow
2> import RxSwift
3> Tensor(1)
error: Couldn't lookup symbols:
protocol witness table for Swift.Double : TensorFlow.AccelerableByTensorFlow in TensorFlow
_swift_tfc_StartTensorComputation
_swift_tfc_FinishTensorComputation
direct field offset for TensorFlow.TensorHandle.cTensorHandle : Swift.OpaquePointer
type metadata accessor for TensorFlow.TensorHandle
3> var x = Tensor([[1, 2], [3, 4]])
x: TensorFlow.Tensor<Double> =terminate called after throwing an instance of 'std::logic_error'
what(): basic_string::_M_construct null not valid
The solution is to add an extra -lswiftTensorFlow flag:
$ docker run --rm --privileged --cap-add sys_ptrace -it -v ${PWD}:/usr/src zachgray/swift-tensorflow:4.2 swift -I/usr/lib/swift/clang/include -I/usr/src/TFExample/.build/debug -L/usr/src/TFExample/.build/debug -lTFExample -lswiftTensorFlow
Welcome to Swift version 4.2-dev (LLVM 04bdb56f3d, Clang b44dbbdf44). Type :help for assistance.
1> import RxSwift
2> import TensorFlow
3> _ = Observable.from([1,2]).subscribe(onNext: { print($0) })
1
2
4> var x = Tensor([[1, 2], [3, 4]])
2018-04-27 23:07:35.467557: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 AVX512F FMA
x: TensorFlow.Tensor<Double> = [[1.0, 2.0], [3.0, 4.0]]
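To avoid forgetting the extra flag, the REPL arguments can be assembled by a small helper. This is only a sketch: the function name repl_args is made up for illustration, the include path and library names are the ones used in this thread, and the last line is a dry run that prints the command instead of executing it.

```shell
#!/bin/sh
# Hypothetical helper: prints the REPL arguments used in this thread,
# with the extra -lswiftTensorFlow workaround flag appended.
repl_args() {
    build_dir="$1"   # e.g. /usr/src/TFExample/.build/debug
    lib_name="$2"    # e.g. TFExample
    printf '%s ' \
        "-I/usr/lib/swift/clang/include" \
        "-I${build_dir}" \
        "-L${build_dir}" \
        "-l${lib_name}" \
        "-lswiftTensorFlow"
    printf '\n'
}

# Dry run: show the command rather than executing it.
echo "swift $(repl_args /usr/src/TFExample/.build/debug TFExample)"
```

Inside the container, the printed command is the same invocation as above, minus the docker run prefix.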
I tested the Swift interpreter by putting the code into test.swift, then running:
docker run --rm --privileged --cap-add sys_ptrace -it -v ${PWD}:/usr/src zachgray/swift-tensorflow:4.2 swift -I/usr/lib/swift/clang/include -I/usr/src/TFExample/.build/debug -L/usr/src/TFExample/.build/debug -lTFExample -O /usr/src/test.swift
This worked without specifying -lswiftTensorFlow, suggesting the problem is probably REPL-specific and involves linker flags.
On Linux, the Swift shared runtime library path is found at <path_to_toolchain>/usr/lib/swift/linux. It contains shared libraries like libswiftCore.so, libswiftTensorFlow.so, libswiftPython.so, etc.
In lib/Driver/Toolchains.cpp (used by the interpreter/compiler), toolchains::GenericUnix::constructInvocation automatically adds flags that add the Swift shared runtime library path and link libswiftCore.so. Presumably, there's other logic for handling other libraries in the same path (like libswiftPython.so), but I couldn't find it.
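To see which libraries in the runtime directory could be passed to the REPL as extra -l flags, something like the following may help. It is a sketch under assumptions: the function name runtime_link_flags is hypothetical, and the example path assumes the toolchain is installed at / (as the -I/usr/lib/swift/clang/include flag in this thread suggests); substitute <path_to_toolchain>/usr/lib/swift/linux for your setup.

```shell
#!/bin/sh
# Hypothetical helper: print one -l flag per libswift*.so found in a directory.
runtime_link_flags() {
    dir="$1"
    for lib in "$dir"/libswift*.so; do
        [ -e "$lib" ] || continue        # glob matched nothing
        name="$(basename "$lib" .so)"    # libswiftTensorFlow.so -> libswiftTensorFlow
        echo "-l${name#lib}"             # libswiftTensorFlow -> -lswiftTensorFlow
    done
}

# Example call; the path is an assumption about where the toolchain lives.
runtime_link_flags /usr/lib/swift/linux
```

The list will include libswiftCore, which the driver already links automatically, so in practice only the additional entries (such as -lswiftTensorFlow) matter.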
The REPL uses entirely separate logic for linking libraries (somewhere in google/swift-lldb). I'll do some digging and try to fix this.
This linking is probably related to #5.
@dan-zheng - nice work man. This is exactly what I was missing. See the following:
docker run --rm --privileged --cap-add sys_ptrace -itv ${PWD}:/usr/src \
zachgray/swift-tensorflow:4.2 \
swift \
-I/usr/lib/swift/clang/include \
-I/usr/src/TFExample/.build/debug \
-L/usr/src/TFExample/.build/debug \
-lswiftPython \
-lswiftTensorFlow \
-lTFExample
Welcome to Swift version 4.2-dev (LLVM 04bdb56f3d, Clang b44dbbdf44). Type :help for assistance.
1> import RxSwift
2> import Python
3> import TensorFlow
4> var x = Tensor([[1, 2], [3, 4]])
2018-04-28 00:11:10.828554: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
x: TensorFlow.Tensor<Double> = [[1.0, 2.0], [3.0, 4.0]]
5> _ = Observable.from([1,2]).subscribe(onNext: { print($0) })
1
2
6> var x: PyValue = [1, "hello", 3.14]
x: Python.PyValue = [1, 'hello', 3.14]
7> :exit
** edited formatting
I'm working on a simple fix now.
Regarding import order: I didn't notice errors when importing Python before TensorFlow, so that's the order I'll use.
I believe this is fixed in 1969380862d0db8ab090325e878e1ca2969ed2d6. You can try the pre-built packages from 05-10 to verify.
This should be solvable with -module-link-name, which avoids this hack in the compiler.
I've reproduced linking this way outside of the Swift compiler. This is also how Foundation and XCTest work. It avoids the problem of linking these libs into every binary whether they are needed or not.
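For illustration, a build that records a link name in the module might look like the command printed below. This is a sketch, not the build used for TensorFlow or Foundation: the helper name autolink_build_cmd, the module name Foo, and the file Foo.swift are placeholders, and the helper only prints the command so it can be inspected without a Swift toolchain.

```shell
#!/bin/sh
# Hypothetical helper: print a swiftc command that records a link name in the module,
# so that code importing the module is asked to link the matching library.
autolink_build_cmd() {
    module="$1"        # placeholder module/library name
    source_file="$2"   # placeholder source file
    echo "swiftc -emit-module -emit-library -module-name ${module} -module-link-name ${module} ${source_file}"
}

# Dry run; run the printed command yourself where swiftc is installed.
autolink_build_cmd Foo Foo.swift
```

Whether the REPL honors the recorded link name the same way the compiler does is not confirmed in this thread.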
Continuing our discussion from the group here.
Full background - I've just copied my comment directly from the group:
I've had some success in using third-party SPM packages by creating a dynamic library and linking to it when launching the REPL, however, it seems like the import order of TensorFlow vs other packages is important; importing the 3rd-party lib first causes a C++ runtime error in TensorFlow.
Here's some snippets:
Package.swift
... then we just fetch dependencies and build with vanilla commands, then invoke the REPL:
Invocation
swift -I/usr/lib/swift/clang/include -I/usr/src/TFExample/.build/debug -L/usr/src/TFExample/.build/debug -lTFExample
At this point, I'm able to import RxSwift and TensorFlow in the REPL without errors in any order; however, when I actually interact with the packages, the incorrect import order does result in a runtime error:
Scenario 1 (OK)
Scenario 2 (runtime error)
The full process is outlined here if more detail is necessary: https://github.com/zachgrayio/swift-tensorflow/blob/example/package/README.md#run-with-dependencies-advanced