aspect-build / rules_js

High-performance Bazel rules for running Node.js tools and building JavaScript projects
https://docs.aspect.build/rules/aspect_rules_js
Apache License 2.0
300 stars 103 forks source link

[Bug]: `js_binary` can't find transitively-dependent modules (Node 19.8+) #1546

Open vpanta opened 5 months ago

vpanta commented 5 months ago

What happened?

Upgraded to Node 20.11.1 and running a js_binary target started erroring with ERR_MODULE_NOT_FOUND for libraries the which were transitively depended upon via an included js_library. We expected transitive dependencies to be appropriately forwarded to the binary.

Upon investigation, this worked in Node 19.7 and started failing in Node 19.8.

Version

Development (host) and target OS/architectures: MacOS Sonoma 14.3.1 on an M1 machine

Output of bazel --version: 6.0.0

Version of the Aspect rules, or other relevant rules from your WORKSPACE or MODULE.bazel file: aspect-build/bazel-lib: 2.4.2 aspect-build/rules_js: 1.37.1

Language(s) and/or frameworks involved: Node 19.8+

How to reproduce

You can reproduce by using https://github.com/vpanta/node_20_deps_issue and setting the Node version in the WORKSPACE to 19.8 or later.

Any other information?

Conversation in Bazel's Slack: https://bazelbuild.slack.com/archives/CEZUUKQ6P/p1708720525262059 Seems to be related to https://github.com/nodejs/node/pull/45258's addition of new fs APIs

alexeagle commented 5 months ago

Node 19.8 added a new fs API https://github.com/nodejs/node/blob/main/doc/changelogs/CHANGELOG_V19.md#2023-03-14-version-1980-current-targos and we don't patch that in https://github.com/aspect-build/rules_js/tree/main/js/private/node-patches so it seems like an obvious new sandbox escape.

jbedard commented 5 months ago

It seems to me that the behaviour after upgrading NodeJS is correct, and I think the issue is actually the fact that it was working previously.

The example repo configures the rules in a way that the :node_modules/mathjs dependency does not get passed through correctly, so it should not be available at runtime, yet it somehow works with the older version of node.

Does that sound right?

pat-trunk-io commented 5 months ago

I actually would think the previous behavior was the one that is expected? If you use mathjs, I don't want to have to pull it's dependencies in my build file, it should be available transitively. Otherwise, every time I add a package dependency, I have to flatten all it's dependencies and add them to my BUILD file. That seems wrong?

adamscybot commented 1 month ago

Using latest RC rules_js, latest RC rules_ts and latest rules_nodejs. And Bazel 7.2.

I'm having this issue. Transitive deps of third party packages declared in my project can not be found when using node 20. Node 18 works fine. I have a PNPM monorepo setup. I have a setup where there are two apps and one share package. And, of course, the workspace root. All of those call npm_link_all_packages.

The shared package declares an npm_package and has :node_modules in its data. This package has almost all of the third party dependencies in its package.json. Nothing outside this package directly references those dependencies.

The app packages are just ts_projects/js_libraries/js_bin targets. They depend on the npm_package from the shared package via its :node_modules/@shared/thing target. In their package.json they declare a workspace:* dependency on the shared package. I also tried using an absolute target from the repo root, to no avail.

In the workspace root package.json it does not have a dependency to any of the other packages via workspace:*. I messed around seeing if I could coerce it this way and didn't get far and stopped trying. Besides, that is not a normal thing to do in a PNPM workspace unless you intend to publish the workspace and it doesn't represent the direction my dep tree makes sense with. There are some limited deps in this workspace root though.

This has led to almost a week of debugging and much confusion. I have actually arrived at what I thought was a "working" setup. The details of this are esoteric and I will try to find the time over the next few days to describe it (but I may arrive at the answer first). But it's partly to do with how ts_project interacts with everything else, possibly when using TSC transpiler in both the apps and the shared package.

Actually, I suspect what I have now is accidentally working and possibly covering up the "real" problem that I originally thought was all to do with this ts_project thing, but I am increasingly believing is actually to do with some issue around the latest rules and node 20 when used in a PNPM workspace setup like this.

It's notable to me, the test cases do not have examples as far as I know of a setup where you have shared packages that output declarations and JS files, and then all of this together with more TS in the app layer. This is a real world scenario, and I think the test cases in this project are too on-the-nose to pick up on problems like mine.

However, this was all only to later find for some devs the missing module occurs temperamentally. And to get local working again, you need to do some deleting in bazel-out, which I realise is a bad sign. The missing module seems to change as well for some people.

I've recently started to wonder if its in part to do with when launching both apps simultaneously (I haven't triggered the issue yet just running one app build at a time yet). This makes me wonder if reuse_sandbox_directories being switched on (by default in Bazel 7) interacts badly. It also makes me wonder about https://github.com/bazelbuild/bazel/issues/22867.

Compounding this idea of some kind of race conditions on symlinks is that when I go to manually follow up from the requesting package inside bazel-out that imports the "missing package" to do a dry-node-resolve in my head, I can see the symlinks to the transitive module from the requesting package's node_modules. It's there, in the correct place. Which is very confusing. And the symlink resolves to the location I would expect in in the aspect_rules "pnpm"ish cache.

I need to next try disabling the FS patch tomorrow, which I've only just become aware of, and I have got a feeling will work. But unsure why these imports would be rejected by the rules_js FS patch. I've checked for ESM vs commonjs weirdness. And I'm including my package.jsons such that they bubble through for ESM detection to work correctly.

The exact cause is very very hard for me to pin down. And I suspect all of these things are interacting.

Its led me to quite a lot of confusion. And this thread has raised alarm bells I've been doing it all wrong. Is the intention that I have to add the transitive dependencies of every 3rd party module as direct dependencies of my project? As that would be thousands? Surely the answer is no, but I'm second guessing.

And now assuming the answer is no. Should I expect the 3rd party dependencies of my shared workspace package to be available to js_binary targets in my app packages (which are not npm_packages themselves), when they depend on the npm_package of that shared package which also declares all those 3rd party deps as data?

I again assume that is how it is is supposed to work, because otherwise all the (hundreds) of deps of the internal shared package would then bubble up to the apps which is not a normal workspace setup and goes against the logical model of workspaces in general (outside bazel at least).

I realise I haven't given enough core details for you to arrive at a sudden fix for me. But I have more investigating to do, and I'm interested in the answer to the above. I feel like I'm close, and if I arrive at the magic bullet, I will find further time to produce a minimal repro that brings together pnpm workspaces, shared internal packages & typescript. In fact, I think the project could do with this sort of "mega testcase" both as a catchall and an indicative example.

adamscybot commented 1 month ago

Turns out my issue was a flag: https://github.com/aspect-build/rules_js/issues/1877