emscripten-core / emsdk

Emscripten SDK
http://emscripten.org
Other
2.92k stars 660 forks source link

Version of `wasm_cc_binary` that doesn't untar the result? #1382

Closed jez closed 1 month ago

jez commented 1 month ago

I'm working on updating our build pipeline to use the upstream version of the emscripten bazel toolchain (our current implementation of using Bazel to drive emscripten predates having upstream support).

Our current build pipeline publishes the *.tar file produced by emscripten and downloads it elsewhere in the build pipeline. After changing our pipeline to use wasm_cc_binary, which is nice because of the platform transition it does, the output is now a multitude of files, which we have to tar back up again.

It would be great if there were an option or another function I could call that only did the transition, and didn't extract the tarball.

walkingeyerobot commented 1 month ago

I'm apprehensive about exposing the tarball. The tarball functions as a workaround for the fact that cc_binary can only produce exactly one output, which we then hide from the user through wasm_cc_binary. Tarring and then immediately untarring output is an implementation deal I'd like to change if and when bazel / cc_binary allows for it, so I'd rather people not depend on that particular detail.

A few solutions come to mind:

  1. Building with --config=wasm will result in a tarball. However, you mentioned you liked the benefits of the transition, which that would avoid, so that's likely not a solution.
  2. You could wrap the output of wasm_cc_binary in a genrule to just re-tar it. It seems wasteful to tar and untar and retar.
  3. We could add an additional output file on wasm_cc_binary. This might not be useful to anyone else, but I don't think that's a good enough reason to not add it. I would hesitate here because it necessarily doubles the total size of the output of wasm_cc_binary for all cases, which might matter for some users. Maybe we can do this but add an attribute that controls this?

Right now I would recommend number two. I think it's entirely reasonable for you to wrap wasm_cc_binary with a bazel macro that retars its output. This would also be resilient against changes to wasm_cc_binary that remove the tarball step entirely. I'm happy to help write this macro if you'd like.

cc @trybka, maybe he has better ideas

trybka commented 1 month ago

Re: #3 we can put it in a custom output group s.t. it isn't built by default, but then you could put a filegroup which reads the custom output group in front. That makes it so that we can avoid doubling the output size by default, and in the future where we don't tar/untar we can keep the custom output group by tarring ourselves (and it would only be invoke if/when a user requests it).

walkingeyerobot commented 1 month ago

Thanks! I made #1383 to do just that

jez commented 1 month ago

Thanks for the fast turnaround! I'm excited to be using the upstream emscripten Bazel files. Years and years ago we hacked an emscripten toolchain together to build our project, but the person who wrote it left and no one else on the team is particularly great at Bazel, so I'm ecstatic that there is upstream support for Bazel in emsdk 🙇‍♂️🙏