r-wasm / webr

The statistical language R compiled to WebAssembly via Emscripten, for use in web browsers and Node.
https://docs.r-wasm.org/webr/latest/
Other
848 stars 67 forks source link

Add recipe for oneTBB #427

Open andrjohns opened 4 months ago

andrjohns commented 4 months ago

Adds recipe for building oneTBB following their instructions for WASM.

Let me know if I've missed anything, thanks!

andrjohns commented 4 months ago

@georgestagg apologies for the ping, but would it be possible to have a look at this PR? I'd like to fix building/linking for packages using RcppParallel and the TBB

andrjohns commented 4 months ago

Does building RcppParallel work as expected on your machine with this change? If so, what OS are you using?

If we can fix or work around the duplicated symbol problem when compiling RcppParallel while linking to this Wasm build of TBB, I'll be happy to merge this PR.

Oh interesting, I've been using pkg-config for setting the linking/include flags in RcppParallel (this branch here) so I hadn't seen this.

It looks like the pkg-config entry only specifies -ltbb for --libs, which is why it wasn't an issue for me. I can open a PR to add the changes to RcppParallel once this is in, and also open an issue upstream with the TBB to see if there's a known fix or similar?

georgestagg commented 4 months ago

Ah ha! That makes sense. So I suppose this is a RcppParallel problem in trying to link both versions of the library when cross-compiling for Emscripten, rather than a problem with TBB itself.

OK, I will try again with your patched version of RcppParallel when I am back in front of my computer tomorrow. If that all goes well, I'll merge this PR and then we can send a patch upstream to get the Emscripten compatibility into RcppParallel.

andrjohns commented 4 months ago

Brilliant, thanks!

andrjohns commented 4 months ago

Also opened an issue upstream with oneTBB for the linking issue: https://github.com/oneapi-src/oneTBB/issues/1392

andrjohns commented 4 months ago

@georgestagg I found a discussion about a similar error where a flag for disabling the build of itt_notify was recommended. I've added it to the recipe here and it's fixed the linking error for RcppParallel (hooray!)

Note that the RcppParallel install will subsequently fail at the install.libs.R step since it only looks for dynamic libs, but you can try my branch where I've added this for WASM.

One question about the changes for RcppParallel - would it be possible to update the rwasm::build/rwasm::add_pkg shims/environment to automatically set TBB_INC and TBB_LIB? I'm setting the paths in RcppParallel at the moment as part of the configure step, but that feels pretty fragile to future changes

georgestagg commented 4 months ago

I've added it to the recipe here and it's fixed the linking error for RcppParallel (hooray!)

Great! That looks like a reasonable solution to me.

Note that the RcppParallel install will subsequently fail at the install.libs.R step since it only looks for dynamic libs [...] would it be possible to update the rwasm::build/rwasm::add_pkg shims/environment to automatically set TBB_INC and TBB_LIB?

Adding TBB_INC and TBB_LIB to the build environment provided by rwasm is possible, but I'd like to avoid it if possible since the oneTBB library is marked as optional. I think the right way to do this is using pkg-config to detect the library at configure time and handle both the cases with and without TBB.

I also think we can skip the copy in install.libs.R, since the library is linked statically at build time anyway?

I've made some further minor tweaks to your RcppParallel patches at r-wasm/RcppParallel. This version uses pkg-config to find TBB under Emscripten, setting TBB_LIB and TBB_INC if it exists. If this version looks good to you, we can submit a PR to RcppParallel.


I have now been able to successfully compile r-wasm/RcppParallel@webr, linking to -ltbb -ltbbmalloc. I gather that eventually you want to compile andrjohns/StanEstimators, is there a Wasm branch somewhere? Currently rwasm::add_pkg("andrjohns/StanEstimators") gives me:

error: Unsupported machine word size.
georgestagg commented 4 months ago

Ah, I see. The error remains because my host version of RCppParallel has not changed. It does not link to oneTBB and includes headers such as tbb/include/tbb/tbb_machine.h without Emscripten support.

In other words, this change does not fix StanEstimators. At least, not by itself. Coordination is required on the cross-compiling machine. Perhaps this needs some more thought.

andrjohns commented 4 months ago

Ah, I see. The error remains because my host version of RCppParallel has not changed. It does not link to oneTBB and includes headers such as tbb/include/tbb/tbb_machine.h without Emscripten support.

Hmm that is a tricky one. Could the recipe install both Emscripten static libs and regular shared libs, such that setting TBB_INC and TBB_LIB is valid for both the host and cross-compiled RcppParallel?

georgestagg commented 4 months ago

The vendored tbb headers in RCppParallel will also need to be modified to recognise __EMSCRIPTEN__ as a valid Unix-style machine, and the modified headers updated on the host machine independently to the rwasm build environment.

I updated the r-wasm/RCppParallel@webr fork to do just that, then ran pak::pak(r-wasm/RCppParallel@webr) to update my host version, and now StanEstimators compiles OK.

However, at runtime I still see:

Could not load dynamic lib: /usr/lib/R/library/StanEstimators/libs/StanEstimators.so
Error: bad export type for '_ZTIN3tbb4taskE': undefined

I wonder if StanEstimators should be linking to oneTBB directly, rather than through RCppParallel, since we set -fvisibility=hidden by default under Wasm.


BTW: Somehow the BH R package (a prerequisite here) is 121MB!! That is not a reasonable download. I'm not sure what's happened there, but it's unrelated to this particular issue. Will investigate another day...

andrjohns commented 4 months ago

The vendored tbb headers in RCppParallel will also need to be modified to recognise EMSCRIPTEN as a valid Unix-style machine, and the modified headers updated on the host machine independently to the rwasm build environment.

The vendored headers shouldn't need to be touched - since setting TBB_INC results in the system headers getting copied into RcppParallel's inst directory

I wonder if StanEstimators should be linking to oneTBB directly, rather than through RCppParallel, since we set -fvisibility=hidden by default under Wasm.

That could indicate an issue with the config, since all packages that depend on RcppParallel use a call to RcppParallelLibs() in their Makevars which should be returning the flags for linking against the system TBB.

I can have a look at the RcppParallel config tomorrow

georgestagg commented 4 months ago

The vendored headers shouldn't need to be touched - since setting TBB_INC results in the system headers getting copied into RcppParallel's inst directory.

I'd rather not require users to reinstall their host RCppParallel package and enforce that they set TBB_INC on their own system. Ideally, everything should be independent and cross-compiling packages for Wasm should not depend on the host system version of RCppParallel at all, but oh well - that is where we are right now.

I can have a look at the RcppParallel config tomorrow

OK. I'll return here if anything else occurs to me.

andrjohns commented 4 months ago

It's not the 'cleanest' solution - but could the build call check for a dependency on RcppParallel and if found, once the dependencies are installed then temporarily replace the include/tbb dir and libtbb & libtbbmalloc files with the system libs?

That way the host RcppParallel is only modified for the cross-compilation, and no re-installation or user-intervention needed

andrjohns commented 4 months ago

In general, how does the cross-compilation handle situations where a package needs to link against another package and include headers from the other package? Can it be configured to link/include against a cross-compiled/wasm package, instead of a host x86_64 package?

georgestagg commented 4 months ago

Can it be configured to link/include against a cross-compiled/wasm package, instead of a host x86_64 package?

It cannot currently. The rwasm package uses the host R installation, overriding Makevars so as to cross-compile. Including C headers from the host-installed version of a package listed in LinkingTo is handled by R itself (source here).

Including the host version of include headers has not caused issues up until this point, probably because for most R packages we've encountered they do not differ over architectures in this way. However, I can see on my machine that when the system RcppParallel includes are added the undefined symbols appear.

$ nm call_stan_orig.o | grep tbb4taskE
         U _ZTIN3tbb4taskE

OTOH manually compiling the object without -I'/[... my host library...]/library/RcppParallel/include' does not include the incorrect symbol:

$ nm call_stan.o | grep tbb4taskE

I am thinking about how to possibly override this, but I cannot easily see a way to stop R adding in those include arguments. I'm beginning to think your prior intuition was the right way to go after all: having the host RcppParallel also link/include the exact same TBB implementation, complied for x86 rather than Wasm. That way the Wasm and host architechture headers would be more compatible.

I think the best way to do this would be to include oneTBB in the official webR development docker image, set the TBB_LIB and TBB_INC env vars, and build the RcppParallel and StanEstimators packages there.

Sorry that it seems we're going around in circles, but I really want to make sure we get this right. I have to work on some other things now, but I'll come back to this when I can.

andrjohns commented 4 months ago

Yeah the different symbols are caused by different headers unfortunately - RcppParallel includes the older "TBB" library, and there were some significant API changes when Intel changed to "oneTBB" - so some symbols are included in the older headers and not in the newer ones (and vice-versa)

Sorry that it seems we're going around in circles, but I really want to make sure we get this right.

Not at all! Linking with the TBB is a universally painful experience

I have to work on some other things now, but I'll come back to this when I can.

No worries, no rush on my end!

andrjohns commented 4 months ago

Another one of the Stan devs just let me know that the current oneTBB has a bug which affects static/WASM initialisation, so this will have to wait until the next release anyway

StaffanBetner commented 3 months ago

There was a new release of oneTBB the day before yesterday.

andrjohns commented 2 months ago

A possible solution for the linking/building issue would be to change rwasm::build to use a temporary site-library when building packages - such that package dependencies which can be used safely 'as-is' are simply copied across, and dependencies for which the cross-compiled version are needed (RcppParallel) are then installed to that library first