Requesting assistance - Githubissues

philipturner commented 2 years ago

From @compnerd’s comment in https://github.com/tensorflow/swift-apis/pull/1184

> The project doesn't seem particularly enticing to me - the "rebuilding from the ground up for GPU acceleration through Metal" is a big component of that. If you were designing it for optimization for DCOM+ (DirectX is effectively accessed through COM), that might be different. GPU environments are quite fragmented and designing around a particular one is a choice that I do not particularly agree with.

One reason I started this project is I didn’t like that S4TF ran exclusively on CUDA. I planned to bring Metal to it, but I’m open to more platforms. I started trying Swift on Windows, although I couldn't get the REPL to work (even with the workaround @compnerd recently posted). I am interested in learning DirectX and making a backend for DirectML. I would author it in C++, then bind it to Swift through C at call sites for raw operations.

> This also seems very much as other platforms are an afterthought. The overall structure of S4TF is far better from a design standpoint - it generically solves the problem in software and has abstractions over a compilation pipeline to separate the concern of optimization and device/runtime independence.

I read the research paper on S4TF, and I realized that the eager execution model of TensorFlow was much more achievable than what I had planned for DL4S. I then considered using an industry-standard interface like XLA for interfacing with both the Metal and DirectX backends. That investigation led me to greatly appreciate S4TF’s public API. I would rather reuse the code base of S4TF than overhaul DL4S. I have been changing my mind left and right over the past week, so my decision isn’t concrete.

I am considering forking tensorflow/swift-apis and adding the new backends to that project. In addition, I will add iOS/tvOS support by removing the need for a separate Swift toolchain - one of the major reasons @palle-k made DL4S. Differentiation is disabled in Swift’s release build, but I found a short-term workaround by wrapping the Swift standard library’s differentiation directory in a Swift package. Ideally, S4TF 0.14.0 would still work with tensorflow/swift-models and fastai/swiftai without any modifications to those repos. The contributions would be purely additive.

philipturner commented 2 years ago

From the Apache 2.0 license, I am allowed to do this as long as I properly document my modifications. There are also precedents of Apple and Microsoft making re-implementations of TensorFlow for MLCompute and DirectML. I am fully aware that I do not own the TensorFlow trademark, and will find a solution that respects Google’s right to the trademark.

Assuming I stay on track with my current plans, I would prefer to have my changes merged into the main branch and get S4TF un-archived. I re-read the last Open Design Review meeting presentation and observed a strange pattern of cancellations leading up to it. I infer that the S4TF team did not give up because they lost funding, but because they got burnt out.

However, Google has a precedent of donating projects they gave up on to an open community - Google Cardboard. A similar thing also happened with a Swift framework, Kitura (although not from Google). There are numerous software developers who would like to see S4TF revived, including @RahulBhalley.

As an alternative, I could maintain a fork of Swift for TensorFlow, which would be the community-hosted continuation. I would ask for one final PR to tensorflow/swift, adding my resurrection to the list of spin-off projects. I assume Google is unwilling to revive the main S4TF repository, so I would ask you to validate my commits to the fork instead.

Ideally, I would like to work with the former S4TF team and have their assistance. I expect to spend 1000 hours on this project, but the end product may be less professional and reach a narrower audience without your endorsement. In addition, I may accidentally break CUDA support since I can’t run the rigorous validation tests configured on tensorflow/swift-apis.

TL;DR - Please let me know your stance on me contributing to Swift for TensorFlow.

ProfFan commented 2 years ago

Hi Philip,

Just a few cents on my end. I definitely would like to see this project resurrected, but there are a few challenges that I can think of.

The biggest one is the toolchain. Autodiff is in Swift mainline, and there is already a lot of adoption (mainly in the Linux world), but a big caveat is that it hasn't been finalized via Swift Evolution, if I remember correctly. That means autodiff may get removed at any time. I am very willing to vote for it (and even work on weekends), and I think a lot of other people are too, but it has to be done. Since @rxwei is the most reliable source on this, please forgive me if my comment above is wrong.

The last thread on evolution is https://forums.swift.org/t/differentiable-programming-for-gradient-based-machine-learning/42147/96

After the adoption of RequirementMachine, quite a few bugs have been detected in the autodiff code, which further increased the urgency of fixes and improvements. For example,

These are the reasons why the PR you have seen is still pending. Actually in debug mode the unit tests already all pass, but in Release mode the compiler crashes...

I really appreciate your work in DL4S, and as you have already observed, the architecture of swift-apis is so elegant that most of the code is generic: it's almost like the only work needed for another backend is changing Tensor... sans making the compiler work.

Fan

philipturner commented 2 years ago

I anticipate that my work won't be completed for up to a year. Would that be enough time for the Swift Standard Library to support differentiable without importing my package?

https://github.com/philipturner/Differentiation

If the S4TF team is up for at least considering the resurrection of this project, then I'll put my time toward getting differentiation stable in the Swift standard library. It should get fixed entirely within at most a few weeks.

ProfFan commented 2 years ago

@philipturner I am just a fellow user (our package SwiftFusion builds on S4TF). Your package looks great! If you can make a demo of this working in iOS it may attract a lot of people in the Swift Forums.

AFAIK, most of the old team is still willing to resurrect this project; they just all have new jobs now and cannot devote the time. We need to work out most of the issues ourselves so that the project gets momentum again.

philipturner commented 2 years ago

How about I do the most time-consuming work of hunting down the source of the bugs? That means you could still work out the fixes yourselves, sans the time cost.

The reason I have time is because I’m not yet in college (a year left though, hence the time constraint). Thus, I have an unusual capacity to contribute compared to you all.

philipturner commented 2 years ago

I might be able to pull off an ARHeadsetKit demo of differentiation by tomorrow or the day after. Would it be appropriate to post a low-resolution video or gif on Swift forums?

I may have misunderstood what you were suggesting I make a demo of. Differentiation or S4TF?

ProfFan commented 2 years ago

Oh I mean differentiation (some application that is enabled by differentiation)

philipturner commented 2 years ago

Epic. And would videos/gifs be appropriate for Swift forums? I think this is the perfect opportunity since it will give a visual impression and be downloadable (as an Xcode project) and interactive.

philipturner commented 2 years ago

With Swift 5.6, we could transfer the Colab notebooks to something similar to Apple's SwiftUI tutorials. I made numerous ARHeadsetKit tutorials in DocC and they're very effective. The content should transfer over easily, and could be hosted on a GitHub Pages website.

A discussion about this is already happening on Swift Forums: https://forums.swift.org/t/support-hosting-docc-archives-in-static-hosting-environments/53572

philipturner commented 2 years ago

@ProfFan the demo has been posted on Swift Forums, although it got caught in the spam filter. We will need to wait a bit before the post goes public.

Repository: https://github.com/philipturner/differentiation-ios-demo
Swift Forums post: https://forums.swift.org/t/swift-for-tensorflow-resurrection-differentiation-running-on-ios/53841/1

philipturner commented 2 years ago

The S4TF toolchain was broken by concurrency in Swift 5.5. I can't use it in Xcode 13.

Also, the following tutorial is too far off from the current state of differentiation to be salvageable in DocC: https://github.com/tensorflow/swift/blob/main/docs/site/tutorials/custom_differentiation.ipynb

After digesting all the information in the Swift Differentiable Programming Implementation Overview, I'm going to convert it to Markdown, which was requested at the top of the document. This should help me with solving the compiler bugs you are currently facing. In addition, any member of the public can dismiss a comment in the Google Doc, which is not good. That won't happen with GitHub comments.

I'll try as hard as I can to preserve it in its original form so we can discuss and agree on changes. Therefore, it would be better if I uploaded the Markdown file to a repository you own, so I couldn't make official changes without your validation. @compnerd @rxwei @BradLarson would a branch of tensorflow/swift-apis work for this purpose? Preferably main or a new branch created from it. We would then discuss modifications in a pull request to the branch.

This PR would also let me fill out the Google contribution agreement that @ProfFan did when making the PR to this repo.

philipturner commented 2 years ago

Would work on S4TF be okay if we just get VJP working (and not JVP, linear maps, transposition)?

BradLarson commented 2 years ago

It's great to hear about the interest in carrying on work around differentiable Swift and the Swift for TensorFlow APIs / models.

Focusing specifically on what it would take to build tensorflow/swift-apis once again (a first step to getting things operational again), @ProfFan's PR #1184 brings tensorflow/swift-apis up to the current state of Swift syntax. It is currently blocked by two compiler assertion failures that are being worked on. These don't yet have Jira tickets because I haven't yet arrived at a single-file reproducer for them, but they both are connected to differentiation through functions with inout values and control flow (if statements, for loops, etc.). Those will both need to be resolved to unblock that PR, and we also have local code that's impacted by these assertion failures.

A new and, I believe, related assertion failure just started appearing with the 2021-12-04 nightly snapshots, and that one I was able to reduce to a reproducer in SR-15566. I'm guessing that will also cause problems with the above-linked PR and will need to be resolved as well. I'm also looking into this, which is easier with the reproducer.

I think all of the autodiff issues that the new Requirement Machine has exposed have since been resolved, so we're good on those right now. All of the current issues are clustered around functions with control flow, and those involving inouts in particular.

Regarding reverse- and forward-mode differentiation, right now only reverse-mode is enabled and within the scope of the initial implementation of differentiable Swift. That should be perfectly fine for almost all of the applications you'd have in mind for frameworks leveraging accelerators and automatic differentiation, so while it would be nice to have the other capabilities I think we'll be just fine with reverse-mode for now.
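For readers unfamiliar with the terms in the question above (VJP, JVP), the standard textbook definitions are sketched below. They are general autodiff background, not a description of how Swift's implementation is structured; the symbols f, J_f, v and w are introduced here for illustration only.

```latex
% f : R^n -> R^m, with Jacobian J_f(x) of shape m x n.
\[
\text{JVP (forward mode):}\quad (x, v) \mapsto J_f(x)\, v, \qquad v \in \mathbb{R}^{n}
\]
\[
\text{VJP (reverse mode):}\quad (x, w) \mapsto w^{\top} J_f(x), \qquad w \in \mathbb{R}^{m}
\]
% Scalar loss (m = 1): a single VJP with w = 1 is the whole gradient.
\[
m = 1,\; w = 1:\quad w^{\top} J_f(x) = \nabla f(x)^{\top}
\]
```

With a scalar loss, one reverse pass yields the full gradient, whereas forward mode would need n JVPs, one per input direction. That is why reverse mode alone covers most training workloads.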

philipturner commented 2 years ago

Note that the compiler also crashed when I differentiated optionals when creating the iOS demo. I’m using the 5.5.1 release toolchain in Xcode, but has this bug been resolved yet?

philipturner commented 2 years ago

@BradLarson if we can get Google Colab up and running with a workaround, then we should invest time into that after fixing autodiff. I have no other way to test S4TF on CUDA or TPU, as I don't own an NVIDIA GPU. Without DirectX support on launch of S4TF 0.14.0, we might shut off Windows and Linux users from hardware acceleration completely.

We can definitely pull off two key features to make the new S4TF stand out from other ML frameworks - being lightweight (low initialization time), and Metal acceleration that's faster than Apple's Python TensorFlow acceleration.

BradLarson commented 2 years ago

Regarding accelerator support, my older replies in this issue on tensorflow/swift largely still hold true. Anything that layers on top of open source TensorFlow, like the current incarnation of tensorflow/swift-apis, will be limited to only the accelerators that TensorFlow supports. That historically has been restricted to CUDA-compatible hardware or TPUs, but it may be possible now to use the DirectML fork of TensorFlow to address Windows. Apple's TensorFlow fork with Apple Silicon support is not open source, nor does it sound like it will be in the near future, and tensorflow/swift-apis needs access to that source to build against.

Broader accelerator support would most likely be possible only through a replacement of the TensorFlow layer with a different underlying runtime. That would open up many options for accelerators that would be challenging to support via TensorFlow.

As an alternative to Colab, you potentially could spin up an image on GCP with a CUDA-capable GPU (or TPU) attached and use that to test acceleration support.

philipturner commented 2 years ago

I was thinking of disguising MPSGraph/MPSNNGraph for GPU, and MLCGraph for CPU (with Apple's AMX accelerators) as XLA. Ops such as 3D convolutions and Fast Fourier Transforms would execute like in eager mode, but the majority would be compatible with Apple's neural network compilers.

On the first minibatch, all commands would be executed eagerly. After S4TF detects two consecutive calls to update a model's parameters, it would examine the command trace between them and (if it already executed that trace exactly once) start compiling it on a background thread. I anticipate that Apple's graph compilers will take a long time, so S4TF will continue executing eagerly for the first few batches until compilation finishes.

Although there are probably many things wrong with this approach, it is the easiest to accomplish and will produce something faster than eager mode. TVM may be the quickest compiler, but that's more complex to set up (not natively integrated into the Apple ecosystem) and appears to only support inference. My idea should accelerate a majority of use cases, and is much more realistic than overhauling S4TF's internals.

I have scanned symbols present in Apple's compiled TensorFlow PluggableDevice executable, and it's the same set of operators as the archived MLCompute fork, minus CPU support. They didn't put any effort into composing 3D convolutions using 2D ops, even though a lot of research into LiDAR depends on 3D CNNs.

philipturner commented 2 years ago

I clarified my current plans for GPU acceleration in the MetalXLA repository.

philipturner commented 2 years ago

I gave one of my own repositories the same fate as Swift for TensorFlow (just for kicks): MetalFFT

Edit: and possibly ARHeadsetKit (research paper not released yet)

philipturner commented 2 years ago

I have a question. I found a reference to the bug TF-1179 while translating @marcrasi's Jupyter kernel from Python to Swift. I couldn't find anything on the JIRA bug tracker website. Is there a way to search for bugs created by S4TF?

BradLarson commented 2 years ago

@philipturner - Many of those old TF- issues got converted into SR- ones. Unfortunately, the TensorFlow sub-project that corresponded to those TF- issues was deactivated, and that appears to have shut down the redirects from the previous TF- issues to the new SR- ones. I don't currently know a great way to find out what SR- issue was for what TF- issue. Maybe there's a way to do this, but I haven't discovered it yet.

philipturner commented 2 years ago

I just made a major realization that I may have to replace the entirety of CTensorFlow with a custom framework (XLAKit) that can substitute for it on Apple platforms and (maybe) Windows. I just want to make sure that this won't be a copyright violation.

I don't expect any of this work to be merged into swift-apis/main for several months, but could you create a branch for me to merge into just for PR validation by the original S4TF team? This would be like the tensorflow branch of apple/swift.

philipturner commented 2 years ago

Responding to @BradLarson: https://github.com/tensorflow/swift-apis/pull/1184#issuecomment-1004255045

> Both can be prevented if you build a non-assertion-enabled toolchain.

I'm fine with building S4TF's frontend in debug mode for now, and the backend in release mode. As described below, I must build this on a stock toolchain and can't rely on a workaround like disabling assertions. Do you think the release-mode bug will be fixed in a few months, in time for my Metal backend to be complete?

> By "release toolchain", do you mean the ones integrated with Xcode, or the swift.org release toolchains? In either case, the 5.5 branch does not have some necessary autodiff fixes in it that have been put in place over the last year.

I was using the toolchain bundled with Xcode on my M1 Mac, and the one from Swift.org when on Google Colab. Both are 5.5.2, which was released on December 13. I haven't seen any commits related to autodiff since early December, so have you been implementing bug fixes in a hidden branch of Apple's repository? You filed new bug reports on JIRA in late December (I assigned a new bug to you btw), but no commits to apple/main reflect that work.

Is it possible to work around the bugs that are present in the December 13 release toolchain but not in the dev toolchain? I can slightly diverge my fork to accommodate differing workarounds.

> If you're trying to build this on an M1, I'm not sure that all even works with the version of TensorFlow you'd need to build to connect to the old Swift for TensorFlow interfaces

I have an Intel Mac mini and might try building once on that just to reproduce the build process. I can target Rosetta in Xcode on my M1 Max MBP, but I haven't looked up how to do that on the command line (and anticipate unexpected problems even if I do try). I'm ultimately going to remove the CTensorFlow/x10 backend on macOS, so I don't need to compile TensorFlow 2.4 for ARM.

> you could try to manually place the path to the correct toolchain in your PATH ahead of the Xcode toolchain in order to force the use of the right toolchain.

I'll try what you suggested and see if I can at least compile code using the 5.6-dev toolchain (even if REPL won't work). I'm temporarily using the dev toolchain only to reproduce what you're currently doing, then modifying my fork to work with release. I planned to move from dev to release on my Mac, then work on Colab support using release. I'm using a Colab VM because it's easier for me to work with than Docker.
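The PATH suggestion quoted above could look roughly like the following. This is only a sketch: `use_toolchain` is a hypothetical helper, and the toolchain directory is an assumed install location (it follows the usual naming of snapshots under /Library/Developer/Toolchains, but the exact path on a given machine may differ).

```shell
#!/usr/bin/env bash
# Hypothetical helper: put a toolchain's bin directory at the front of PATH
# so its swift/swiftc are found before the Xcode-bundled ones.
use_toolchain() {
  local bin_dir="$1/usr/bin"
  export PATH="$bin_dir:$PATH"
}

# ASSUMPTION: install location of a development snapshot; adjust as needed.
TOOLCHAIN_DIR="${TOOLCHAIN_DIR:-/Library/Developer/Toolchains/swift-DEVELOPMENT-SNAPSHOT-2021-11-12-a.xctoolchain}"
use_toolchain "$TOOLCHAIN_DIR"

# Show which swiftc now wins the lookup (prints nothing if none is installed).
command -v swiftc || true
```

The change only lasts for the current shell session, which makes it easy to switch back to the Xcode toolchain by opening a new terminal.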

Here are all the platforms I'm targeting with the resurrected S4TF:

Google Colab:

x86_64 only
Swift release toolchain (I can try making the dev toolchain work too)
CTensorFlow/x10 backend

Linux outside of Colab:

No active effort to support this platform (I don't have the time or resources to do this, and I don't particularly care about bringing first-class support to NVIDIA GPUs)

Windows:

x86_64 only because Swift doesn't support ARM64 Windows
Swift release or dev toolchain
CTensorFlow/x10 backend, in the far future switch to a custom one based on DirectML with no option for training models on CPU

macOS:

x86_64 under emulation is okay, but must compile on ARM
Swift release or dev toolchain
New backend, although I can temporarily rely on TensorFlow

iOS (for training, NOT inference):

ARM64, iOS 15.x
Swift release toolchain
No PythonKit
New backend, as TensorFlow does not run on iOS (TensorFlow Lite doesn't count because it's only inference)
Must run in iPad Swift Playgrounds for people who use an iPad instead of a laptop

Google Colab is the only platform that must use TensorFlow; the others can work with my new backend.

philipturner commented 2 years ago

I tried really hard, but I encountered an insurmountable problem with using development toolchains on Colab. I absolutely must build S4TF with the release toolchain.

It was a miracle that I even got it to build on the release toolchain in the first place, because there were so many places I anticipated I’d encounter insurmountable problems.

philipturner commented 2 years ago

@BradLarson @brettkoonce @ProfFan I still experienced the problem with the build freezing on my Intel Mac mini. There are problems with the Bazel SwiftPM system that make it impossible to build, and the CMake system freezes while executing the cmake --build out command. Did you experience this problem?

The freeze seems to happen most often while linking a shared library for PythonKit. After using a pre-compiled executable, I got the following error in the place where it froze previously:

ninja: build stopped: subcommand failed

When going for option 2, building with SwiftPM instead of CMake (using the pre-compiled x10 executable), I received the following compiler crash on both tensorflow/swift-apis/main and @ProfFan's fork: log.txt

philipturner commented 2 years ago

I'm going to have to use Colab for the rest of today. Is there any difference between the release and dev toolchains, besides the inclusion of autodiff, that affects the build process?

BradLarson commented 2 years ago

There's maybe a slight misunderstanding about what's present in what version of the Swift toolchain, so I can try to detail the general setup. The nightly snapshots are based on the head of apple/swift, and contain the latest versions of pretty much everything (with a slightly lagging but known stable version of LLVM). The nightly snapshots are created when the Swift CI goes green, so they're known points where the compiler should build and pass almost all tests. The release points correspond to the swift-DEVELOPMENT-SNAPSHOT-XXXX tags here, so when building the compiler it's easiest to check out head at one of those tags and make sure your other checkouts match the timestamp of the relevant tag.

The nightly snapshots include the latest fixes and features, but may also include regressions, so when working with an experimental feature like autodiff you need to test and find specific snapshots that contain all the fixes you need and none of the bugs that may impact your project. You can download old snapshots that correspond to one of the above-linked tags by editing the URL in the snapshot downloads to correspond to the appropriate date of the desired snapshot.
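Editing the date in a snapshot download link, as described above, can be scripted. The sketch below is hedged: `snapshot_url` is a hypothetical helper, and the URL layout is an assumption modeled on how Swift.org download links have looked, so check it against the current download page before relying on it.

```shell
#!/usr/bin/env bash
# Build a download URL for a dated development snapshot.
# ASSUMPTION: the URL layout mirrors the Swift.org download links;
# it is not taken from the thread.
snapshot_url() {
  local date="$1"                      # e.g. 2021-11-12
  local platform="${2:-ubuntu18.04}"   # e.g. ubuntu18.04
  local tag="swift-DEVELOPMENT-SNAPSHOT-${date}-a"
  local platform_dir
  # The directory component drops the dot: ubuntu18.04 -> ubuntu1804
  platform_dir="$(printf '%s' "$platform" | tr -d '.')"
  printf 'https://download.swift.org/development/%s/%s/%s-%s.tar.gz\n' \
    "$platform_dir" "$tag" "$tag" "$platform"
}

snapshot_url 2021-11-12
```

Swapping the date argument is then enough to step between known-good snapshots such as the ones named in this thread.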

The Swift.org versioned Releases (5.5.2, etc.) don't come from apple/swift head. Instead, they are pulled from a stable branch (release/5.5 for the current 5.5.x releases) that branched off of head a while ago (in release/5.5's case, that was some time early in 2021), so they do not have all the features and patches that have since gone into Swift head. Limited bug fixes and feature additions (like concurrency) have been cherry-picked into the stable branch from head, but only after review. None of the recent autodiff fixes have been cherry-picked into the 5.5 stable branch, to my knowledge.

The Xcode stable release toolchains are based on the versioned releases, and are further stripped down to remove some experimental standard library additions, like the Differentiation functions and types. They are the only ones that can support iOS deployment, however.

Because the versioned Release Swift.org and Xcode stable toolchains lag so far behind in terms of autodiff support, they've largely been impractical for us to use for our differentiable Swift applications internally. What we've done is to pick specific nightly snapshots that are known-good and use those until ready to step up to another known-good nightly snapshot. As examples, we had used the 2021-07-07 nightly for a while, then recently migrated to the 2021-11-12 nightly, and are also supporting the 2021-12-23 nightly snapshot.

Without a build log, it's hard to tell what's failing during the CMake TensorFlow + swift-apis build, but a first guess would be that Bazel is exhausting available resources in memory or hard drive space. In the past, I've had to limit the number of parallel Bazel jobs on systems with less than 16 GB of RAM via its config files.
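One way to cap Bazel's parallelism through a config file is sketched below. `--jobs` is a standard Bazel build option; the helper name `limit_bazel_jobs`, the job count of 4, and the temporary file are illustrative assumptions. Whether the swift-apis CMake build picks up a given bazelrc depends on how it invokes Bazel, which this thread does not confirm.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append a job limit to a bazelrc file so that Bazel
# runs fewer build actions in parallel (and so uses less memory at once).
limit_bazel_jobs() {
  local jobs="$1"
  local rc_file="$2"
  printf 'build --jobs=%s\n' "$jobs" >> "$rc_file"
}

# Written to a temporary file here; in practice the line would go into the
# workspace .bazelrc or ~/.bazelrc.
RC_FILE="$(mktemp)"
limit_bazel_jobs 4 "$RC_FILE"
cat "$RC_FILE"
```

A lower job count trades build time for a smaller peak memory footprint, which matters most on machines near the 16 GB mark mentioned above.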

In the case of the SwiftPM build, which toolchain snapshot is causing the assertion failure that you provided the stack trace for? That looks like a typechecking issue that might be fixed in the 2021-12-23 nightly snapshot.

philipturner commented 2 years ago

First, I need to find a snapshot that both has the bug fixes and works on Colab. The reason 2021-12-23 is incompatible is because it lacks the Python LLDB API. I tried transplanting the LLDB executable from the release toolchain (5.5.2 from Swift.org) into the dev toolchain, but after all my efforts, I ended up with a "SuccessWithoutValue()" error. The executable names are different (liblldb.so.10git on release and liblldb.so.13git on dev). I now realize that I can trace this error's source by logging checkpoints within the Swift Jupyter kernel's code, but transplanting is still a hacky solution. When I tried using the existing liblldb.so.13git, I encountered a Python error because some LLDB function (delete_SB[something I can't remember]) didn't exist in the newer version anymore.

You once got Swift-Jupyter to work on a toolchain that wasn't an official release. I examined the google/swift-jupyter source code and it references the same Python LLDB package from the toolchain's /usr/bin/python3/dist-packages directory as is in the current release toolchain (but absent from dev). Is there a way to build a replica of the nightly toolchain that includes this API? @marcrasi you might be able to help me out on this.

I got the stack trace this morning on my Mac mini. I was using the toolchain bundled with Xcode 13.1 (5.5.1), which I just updated to Xcode 13.2 a few hours ago. The other toolchain I tested was 2021-12-23, downloaded from Swift.org. I tested on the dev toolchain first, then switched back to the Xcode toolchain to reproduce the crash. I lost the terminal window that gave the precise conditions that created the crash, but it happened in the middle of code compilation. It was building on SwiftPM and I don't think it involved Bazel because I was linking the pre-compiled executable on swift-apis' Development.md. I was using the Swift code from @ProfFan's modified fork, which might have conflicted with the pre-compiled executable (which I have yet to build successfully in any environment). The Mac is a 6-core i7 with 16 GB memory.

I'm also struggling to build Bazel in a Colab VM because of a problem with resolving symbolic links. I'll get back to you once I have investigated more thoroughly why this is breaking. Before I tried Bazel on Colab, I tried using CMake but faced problems with upgrading past the built-in 3.12.0 (I need at least 3.17.0). I'm not 100% sure that's what ultimately prevented me from using CMake though.
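A quick check of the installed CMake against the 3.17.0 minimum mentioned above might look like this. It is a sketch that assumes a `sort` supporting `-V` (GNU coreutils does); `version_at_least` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when version $1 >= version $2.
# Relies on `sort -V` (version sort), which GNU coreutils provides.
version_at_least() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# `cmake --version` prints "cmake version X.Y.Z" on its first line.
installed="$(cmake --version 2>/dev/null | awk 'NR==1 {print $3}')"
if [ -z "$installed" ]; then
  echo "cmake not found"
elif version_at_least "$installed" 3.17.0; then
  echo "cmake $installed is new enough"
else
  echo "cmake $installed is older than 3.17.0"
fi
```

Running this before the build makes it clear whether the CMake version is the blocker or whether the failure lies elsewhere.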

philipturner commented 2 years ago

@BradLarson I think I can work around the toolchain problem on Colab. Python LLDB is only needed for the interactive notebook experience, but I could instead just download Swift source files and compile them using bash commands. That would allow compilation just for the purpose of testing.

philipturner commented 2 years ago

I am unable to link S4TF's build products to an executable. I am using the 2021-11-12 snapshot because later ones have a compiler crash right now (see https://github.com/tensorflow/swift-apis/pull/1184#issuecomment-1008006067). It says:

<unknown>:0: error: missing required module '_NumericsShims'

I did not try running the tests yet, but my primary concern is that this means I can't run any Swift code that imports TensorFlow besides its built-in package tests.

I tried working around it by passing sub-directories of the SwiftPM build products folder, but nothing changed. Then, I downloaded Swift Numerics and compiled it separately. That second approach doesn't work either, because Swift Numerics doesn't make any of its build products dynamic libraries, so no .so files exist in its SwiftPM build products folder (nothing for me to link).

I might work around this by making a Swift package that imports S4TF, then passes the special build flags (shown below) as the swiftSettings parameter of a target. Finally, I would link the .so file produced by compiling that package. Is there any better way to link S4TF to an executable from the command line?

S4TF-specific build flags:

--verbose -Xswiftc -DTENSORFLOW_USE_STANDARD_TOOLCHAIN \
    -Xcc -I/Library/tensorflow-2.4.0/usr/include \
    -Xlinker -L/Library/tensorflow-2.4.0/usr/lib

Colab notebook: CompilingSwift2 (7).ipynb.zip

compile.sh in my s4tf-colab-experiments repo at the time of creating this comment:

export PATH="/opt/swift/toolchain/usr/bin:$PATH"

products_path="/content/swift-apis/.build/x86_64-unknown-linux-gnu/debug"

swiftc -DDEBUG $1 \
  -L $products_path -I $products_path \
  -lTensorFlow #-lTensor -lx10_optimizers_optimizer -lx10_optimizers_tensor_visitor_plan

script.swift in the same repo:

import TensorFlow

struct TensorFloatWrapper {
    var data: Tensor<Float>
}

print(TensorFloatWrapper.self)

philipturner commented 2 years ago

Also, the tests fail with the following error immediately after they finish building (same toolchain as the previous comment):

/content/swift-apis/.build/x86_64-unknown-linux-gnu/debug/TensorFlowPackageTests.xctest: error while loading shared libraries: libx10.so: cannot open shared object file: No such file or directory

I used the command given at the very end of Development.md, but modified it to include -Xswiftc -DTENSORFLOW_USE_STANDARD_TOOLCHAIN because otherwise it would fail to compile. I tried copying libx10.so from the Library folder to the top-level directory (/), but the error still happened.

ProfFan commented 2 years ago

You need to append the directory containing libx10.so to the LD_LIBRARY_PATH environment variable.
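A minimal sketch of that fix follows. `append_ld_library_path` is a hypothetical helper, and the directory is taken from the -L flag in the build flags earlier in the thread; whether libx10.so actually lives there on a given machine is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append a directory to LD_LIBRARY_PATH without leaving
# a stray leading colon when the variable starts out empty or unset.
append_ld_library_path() {
  local dir="$1"
  export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$dir"
}

# ASSUMPTION: libx10.so sits in the same directory passed to the linker.
append_ld_library_path "/Library/tensorflow-2.4.0/usr/lib"
echo "$LD_LIBRARY_PATH"
```

The dynamic loader reads this variable when the test binary starts, so the test command has to be rerun from the same shell session for the change to take effect.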

philipturner commented 2 years ago

The test suite executed on 2021-11-12 with zero failures!

philipturner commented 2 years ago

I'm at a loss for how to compile S4TF in a way that I can run anything besides its tests. I tried making a package that imports S4TF, similar to libjupyterInstalledPackages in the Jupyter kernel. I have no clue what I'm doing when playing around with linker settings in the Swift package manifest. For now, I'll make a branch of S4TF, delete its tests, and replace them with custom code.

philipturner commented 2 years ago

I'm commenting here because I'm not sure of the best place to ask this question. In "swift/include/swift/SILOptimizer/Differentiation", there is a file called PullbackCloner.h, but no file called DifferentialCloner.h. Does that reflect the state of forward-mode differentiation being behind reverse-mode differentiation?

philipturner commented 2 years ago

@rxwei (citing the comment above) I have a closed PR that removed forward-mode differential operators from the public API. Are there any plans to remove those from the standard library before AutoDiff goes into a release toolchain?

tensorflow / swift-apis

Requesting assistance #1185