dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

Create nuget package for WASI-SDK #82788

Open directhex opened 1 year ago

directhex commented 1 year ago

Ref. https://github.com/dotnet/runtime/issues/65895

We need a way to ship the WASI SDK to users (as a nupkg). Currently, for builds, we're using binaries shipped out of the WebAssembly wasi-sdk.git releases page.

I'm investigating this process.

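wasi-sdk consists of three git submodules:

  * Upstream llvm/llvm-project.git
  * Upstream WebAssembly/wasi-libc.git
  * Upstream https://git.savannah.gnu.org/git/config.git (GNU config.guess/config.sub scripts)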

These are used to build FIVE projects in turn (or three, depending on how you count):

  1. LLVM Clang (lld;clang;clang-tools-extra subsets) with WebAssembly codegen and wasm32-wasi as the default compiler triple. This needs to be built once for every platform we want to be able to build Wasi apps on.
  2. Wasi libc (twice, once threaded and once not), using the above Clang, to instantiate a sysroot folder. This only needs to be built once, and can be consumed on any OS.
  3. LLVM Compiler-RT (the lib/builtins folder specifically, i.e. to generate libclang_rt.builtins-wasm32.a). This only needs to be built once, and can be consumed on any OS.
  4. LLVM libc++. This only needs to be built once, and can be consumed on any OS.
  5. GNU config. This isn't compiled, a few files are just copied around.
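
As a rough illustration of the build order described above, here is a minimal Python sketch that models the five projects as a dependency graph and expands the one per-host step. Only the "Wasi libc is built with the above Clang" edge comes from the list; the other edges, the component names, and the host names are assumptions made for the example.

```python
from graphlib import TopologicalSorter

# Component -> components that must be built first.
# Only "wasi-libc needs clang" is stated in the list above; the
# remaining edges are assumptions for illustration.
BUILD_DEPS = {
    "clang": set(),
    "wasi-libc": {"clang"},
    "compiler-rt": {"clang"},
    "libcxx": {"clang", "wasi-libc"},
    "gnu-config": set(),  # not compiled, files are only copied
}

# Components that must be rebuilt for every host platform.
PER_HOST = {"clang"}


def build_plan(hosts):
    """Return (component, host) pairs in a valid build order."""
    plan = []
    for component in TopologicalSorter(BUILD_DEPS).static_order():
        # Host-independent outputs are built once and tagged "any".
        targets = hosts if component in PER_HOST else ["any"]
        for host in targets:
            plan.append((component, host))
    return plan


if __name__ == "__main__":
    # Hypothetical host identifiers.
    for step in build_plan(["linux-x64", "osx-arm64"]):
        print(step)
```

With two hosts this yields two Clang builds and one build each of the remaining four components, which is where most of the cost of supporting an extra host platform comes from.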

The existing metabuild system in charge of dispatching the above builds is GNU Make. This needs replacing with something we support, e.g. MSBuild (we have experience using MSBuild as a metabuild system, e.g. for LLVM or ICU)

There are a few approaches we can take on how best to get this built.

  1. Treat wasi-sdk as its own standalone thing, which means at least another full-scale build of LLVM. We have changes we would want to apply to the LLVM repo (like support for overriding the copyright on produced executables), so the easiest route here might be altering the submodule reference to point at our LLVM fork, either a new branch or an existing branch
  2. Try to build a suitable LLVM as part of our existing LLVM build, i.e. add WebAssembly to the list of codegen targets and add any needed binaries to the LLVM.Sdk nupkgs. A major downside is that if we add functionality to our existing LLVM binaries, we can't change default settings (like "use this target by default and this sysroot by default"), so our binaries would behave differently from upstream wasi-sdk. We might be able to squeeze a completely alternative build out of the existing LLVM repo, but that means more disk consumption for end users (who would have duplicate copies of binaries, some from the vanilla LLVM build and some from the special Wasi build). Additionally, our LLVM version does not necessarily match what WASI has integration-tested. The documentation claims it's fine to do so, but we're into uncharted waters.
  3. Something more drastic, e.g. build the whole wasi-sdk package out of our llvm-project.git by pulling in the needed files (and not using wasi-sdk.git itself directly at all, since all it provides by itself is a metabuild which we need to rewrite anyway)
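
To make the difference between options 1 and 2 more concrete, here is a hedged Python sketch that assembles the CMake cache arguments each one might pass to an LLVM configure step. `LLVM_ENABLE_PROJECTS`, `LLVM_TARGETS_TO_BUILD`, `LLVM_DEFAULT_TARGET_TRIPLE` and `DEFAULT_SYSROOT` are real LLVM/Clang CMake options; the function name, the existing target list (`X86`, `AArch64`) and the sysroot path are assumptions, not what our build actually uses.

```python
def llvm_cmake_args(standalone_wasi, sysroot=None):
    """Return CMake cache arguments for an LLVM configure step.

    standalone_wasi=True  -> option 1: a dedicated wasi-sdk style build
    standalone_wasi=False -> option 2: WebAssembly added to a shared build
    """
    if standalone_wasi:
        targets = ["WebAssembly"]
    else:
        # Hypothetical existing target list plus the new codegen target.
        targets = ["X86", "AArch64", "WebAssembly"]

    args = [
        "-DLLVM_ENABLE_PROJECTS=lld;clang;clang-tools-extra",
        "-DLLVM_TARGETS_TO_BUILD=" + ";".join(targets),
    ]

    if standalone_wasi:
        # Only a dedicated build can bake in wasi-specific defaults.
        args.append("-DLLVM_DEFAULT_TARGET_TRIPLE=wasm32-wasi")
        if sysroot is not None:
            args.append("-DDEFAULT_SYSROOT=" + sysroot)
    return args


if __name__ == "__main__":
    print(llvm_cmake_args(True, sysroot="../share/wasi-sysroot"))
    print(llvm_cmake_args(False))
```

The shared build in option 2 gains the WebAssembly backend but leaves the default triple and sysroot untouched, which is the behavioural difference from upstream wasi-sdk noted above.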

I'm starting to lean towards option 2, with a "damn the consequences" approach to command line flags, but I think the discussion is worth having.
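
For a sense of what the command line flag consequence could look like in practice, here is a small Python sketch that builds a Clang invocation with and without baked-in defaults. `--target=` and `--sysroot=` are standard Clang options; the function name, file names and sysroot path are hypothetical.

```python
def clang_command(source, output, sysroot, defaults_baked_in=False):
    """Build the argv list for compiling one file to wasm32-wasi.

    With upstream wasi-sdk style defaults the target and sysroot are
    implied; with a shared LLVM build they must be passed explicitly.
    """
    cmd = ["clang"]
    if not defaults_baked_in:
        cmd += ["--target=wasm32-wasi", "--sysroot=" + sysroot]
    cmd += [source, "-o", output]
    return cmd


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(clang_command("hello.c", "hello.wasm", "/opt/wasi-sysroot"))
    print(clang_command("hello.c", "hello.wasm", "/opt/wasi-sysroot",
                        defaults_baked_in=True))
```

Under option 2, whatever drives the compiler would be responsible for always adding the two extra flags.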

ghost commented 1 year ago

Tagging subscribers to 'arch-wasm': @lewing See info in area-owners.md if you want to be subscribed.

Issue Details
Author: directhex
Assignees: -
Labels: `arch-wasm`, `discussion`, `os-wasi`
Milestone: -

dotnet-issue-labeler[bot] commented 1 year ago

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

steveisok commented 1 year ago

/cc @akoeplinger @radical

akoeplinger commented 1 year ago

> The existing metabuild system in charge of dispatching the above builds is GNU Make. This needs replacing with something we support, e.g. MSBuild (we have experience using MSBuild as a metabuild system, e.g. for LLVM or ICU)

We could also add another metabuild layer on top and just call the Makefile from MSBuild :)

I'm also leaning towards option 2.

ghost commented 1 year ago

Tagging subscribers to this area: @directhex See info in area-owners.md if you want to be subscribed.

Issue Details
Author: directhex
Assignees: -
Labels: `arch-wasm`, `discussion`, `area-Infrastructure-mono`, `os-wasi`
Milestone: -

directhex commented 1 year ago

OK, so, first real crossroads in option 2:

We need to build three things from the llvm-project.git repo:

  1. clang and llvm tools
  2. compiler-rt
  3. libcxx

I have successfully built item 1 from dotnet/llvm-project (version 14), and items 2 and 3 from wasi-sdk/src/llvm (upstream llvm/llvm-project, version 15). It seems to work, for now.

We can't just build compiler-rt and libcxx out of our llvm repo build, because we need to link against wasi-libc (so we would need to submodule wasi-libc into our llvm fork, and we're deep on a road to madness here). So our options are:

  1. Pull in wasi-libc to our llvm repo, so we can build compiler-rt and libcxx in there
  2. Redirect the submodule in our wasi-sdk fork to dotnet/llvm-project so we build those projects out of our source tree but as part of the wasi-sdk repo (so the version numbers match between clang/llvm and compiler-rt/libcxx)
  3. :shipit: just build upstream compiler-rt/libcxx out of wasi-sdk, using our clang/llvm builds, which works but does have the version mismatch
  4. Put the compiler-rt/libcxx sources in a nupkg in llvm-project, so we can consume them as nupkgs in wasi-sdk and get the matching version sources into our wasi-sdk build tree that way

directhex commented 1 year ago

> Redirect the submodule in our wasi-sdk fork to dotnet/llvm-project so we build those projects out of our source tree but as part of the wasi-sdk repo (so the version numbers match between clang/llvm and compiler-rt/libcxx)

Option 2 here is the current planned route.
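
The version-matching concern behind this choice could be guarded with a simple check. The following Python sketch extracts the LLVM major version from `clang --version` style text and compares it with the major version of the compiler-rt/libcxx sources; the helper names and sample strings are hypothetical, and the regular expression assumes the usual "clang version X.Y.Z" banner.

```python
import re


def llvm_major(version_text):
    """Extract the major version from text like 'clang version 14.0.0'."""
    match = re.search(r"version\s+(\d+)\.\d+", version_text)
    if match is None:
        raise ValueError("no version number found in: " + repr(version_text))
    return int(match.group(1))


def versions_match(clang_version_text, runtime_libs_major):
    """True when clang and the compiler-rt/libcxx sources share a major version."""
    return llvm_major(clang_version_text) == runtime_libs_major


if __name__ == "__main__":
    # Illustrative banner text, not captured from a real build.
    banner = "clang version 14.0.0"
    print(versions_match(banner, 15))  # the mismatch described earlier
    print(versions_match(banner, 14))  # what redirecting the submodule aims for
```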

lewing commented 1 year ago

I think we are close here @steveisok