Closed allsey87 closed 6 days ago
Note that although all checks are currently passing, this is only testing the pass-through behaviour for the moment. I will add additional checks that test the secondary cache shortly.
@sbc100, @walkingeyerobot, this is now ready for review. I came close to being able to support secondary caches on Windows, but there are some quirks to when Node.js is installed under external
that do not apply to Linux and Mac. I would have liked to have gotten this working, but I have already spent a whole day just on supporting Windows and I really cannot spend any more of my time on this.
The steps for enabing Windows support in its current state are:
emscripten_cache.bzl
test_bazel.ps1
In this state, the test on Windows will fail on line 78 of emscripten_cache.bzl
since Node.js is not available at external/nodejs_windows_amd64/bin/nodejs/node.exe
.
Does this degrade Windows support in any way? Or does building non-prebuilt system libraries simply not work on Windows?
Does this degrade Windows support in any way? Or does building non-prebuilt system libraries simply not work on Windows?
IIUC this doesn't degrade existing features but adds a new feature that simply doesn't work on windows (yet). Seems like its probably worth addressing that before we land.
Does this degrade Windows support in any way? Or does building non-prebuilt system libraries simply not work on Windows?
IIUC this doesn't degrade existing features but adds a new feature that simply doesn't work on windows (yet).
This is correct. Windows still works as normal with the prebuilt cache as long as emscripten_cache
does not contain any arguments.
Seems like its probably worth addressing that before we land.
The only path forward that I am aware of which might solve the problem of Node.js not being available on Windows in time would be to migrate to rules_js as proposed in #1388 since the newer repo uses the bzlmod approach (MODULE.bazel) instead of the WORKSPACE file to set up Node.js. I suspect that the bzlmod initialisation may happen earlier than the WORKSPACE which could mean that Node.js would be available on all platforms by time I need to run embuilder.
Would it be ok to add this on to this PR or should it be in a separate PR?
Let's do that in a separate PR; this one is already kind of beefy.
To be honest I'm surprised you need node_js configured at all to run embuilder... can't you just using a config file that doesn't contain node_js at all when you call embuilder?
Then create a new config file which is a copy of the core on, just with a new CACHE at the end, for use in the rest of the toolchain? Also, it might be possible to just stick with the existing config file and set the EM_CACHE environment variable (for use in the rest of the toolchain that is.. i.e. after the toolchain has been setup).
To be honest I'm surprised you need node_js configured at all to run embuilder... can't you just using a config file that doesn't contain node_js at all when you call embuilder?
Then create a new config file which is a copy of the core on, just with a new CACHE at the end, for use in the rest of the toolchain? Also, it might be possible to just stick with the existing config file and set the EM_CACHE environment variable (for use in the rest of the toolchain that is.. i.e. after the toolchain has been setup).
I just had a quick look into this. It might actually just be the call to check_sanity
in embuilder.py that results in the node version check and requirement on the NODE_JS
variable. Looking at those functions, it might be possible to bypass that check by setting EM_IGNORE_SANITY
in the environment and hence removing the dependency on Node.js...
To be honest I'm surprised you need node_js configured at all to run embuilder... can't you just using a config file that doesn't contain node_js at all when you call embuilder? Then create a new config file which is a copy of the core on, just with a new CACHE at the end, for use in the rest of the toolchain? Also, it might be possible to just stick with the existing config file and set the EM_CACHE environment variable (for use in the rest of the toolchain that is.. i.e. after the toolchain has been setup).
I just had a quick look into this. It might actually just be the call to
check_sanity
in embuilder.py that results in the node version check and requirement on theNODE_JS
variable. Looking at those functions, it might be possible to bypass that check by settingEM_IGNORE_SANITY
in the environment and hence removing the dependency on Node.js...
We shouldn't be checking for NODE_JS untill we actually need it. I can take a look at fixing that on the emscripten side if that is true.
Looks like you can probably just ignore the node version-check warnings if embuilder generates any.
To be honest I'm surprised you need node_js configured at all to run embuilder... can't you just using a config file that doesn't contain node_js at all when you call embuilder? Then create a new config file which is a copy of the core on, just with a new CACHE at the end, for use in the rest of the toolchain? Also, it might be possible to just stick with the existing config file and set the EM_CACHE environment variable (for use in the rest of the toolchain that is.. i.e. after the toolchain has been setup).
I just had a quick look into this. It might actually just be the call to
check_sanity
in embuilder.py that results in the node version check and requirement on theNODE_JS
variable. Looking at those functions, it might be possible to bypass that check by settingEM_IGNORE_SANITY
in the environment and hence removing the dependency on Node.js...We shouldn't be checking for NODE_JS untill we actually need it. I can take a look at fixing that on the emscripten side if that is true.
It is true, the call stack is main -> check_sanity -> perform_sanity_checks -> check_node_version.
Then create a new config file which is a copy of the core on, just with a new CACHE at the end, for use in the rest of the toolchain?
This is already how things are working. The CACHE line is added to the end of default_config
which is written out to emscripten_config
and used by the toolchain.
Also, it might be possible to just stick with the existing config file and set the EM_CACHE environment variable (for use in the rest of the toolchain that is.. i.e. after the toolchain has been setup).
I was experimenting with this approach initially, I will investigate and try to remember why I don't use the environment variable anymore.
Looks like you can probably just ignore the node version-check warnings if embuilder generates any.
Wrong version is a warning. Node.js missing results in a crash.
Looks like you can probably just ignore the node version-check warnings if embuilder generates any.
Wrong version is a warning. Node.js missing results in a crash.
If necessary we could probably fix upstream emscripten so that we don't depend on node JS when running embuilder. But fixing it downstream (here) seems reasonable too.. especially if we want to support older versions of emscripten.
Looks like you can probably just ignore the node version-check warnings if embuilder generates any.
Wrong version is a warning. Node.js missing results in a crash.
If necessary we could probably fix upstream emscripten so that we don't depend on node JS when running embuilder. But fixing it downstream (here) seems reasonable too.. especially if we want to support older versions of emscripten.
So it turns out there are two places where Node.js is being checked:
But it is possible to work around both these checks with a small hack by setting the following environment variables:
EM_IGNORE_SANITY=1
EM_NODE_JS=empty
The empty
bit is the small hack, it can be any non-empty string and is used to get us past the checks inside set_config_from_tool_location
by setting val
to something that isn't "falsy".
And we have Windows support!! I am not sure why the test-bazel-mac-arm64
is stuck at "Waiting for status to be reported" and not starting, but I think that is unrelated to this PR. Perhaps one of you two can start/trigger this check manually?
I noticed today that my headers were still being included from the default cache location instead of the generated cache location. I think that is due to toolchain.bzl#483. Am I correct in thinking this path should be updated to point to the newly generated cache? I guess it doesn't really matter since we are just looking at header files?
The header files should be exactly the same so it shouldn't matter which sysroot is used for headers.
thanks very much! this looks good to me.
@walkingeyerobot @sbc100 I think this PR will need to be manually merged due to a CircleCI gitch (test-bazel-mac-arm64
is just waiting indefinitely).
This is a working solution for generating a separate Emscripten cache. Note that this requires an additional entry in the workspace as follows:
When used like this, the default Emscripten cache will be used. However, if the entry is as follows:
Then embuilder will be called to build all system libraries and ports (i.e., the
ALL
option to embuilder) with the LTO option enabled. This can take awhile, so I have also made possible to specify which libraries you want to build explicitly:Resolves #807, resolves #971, resolves #1099, resolves #1362, resolves #1401