Closed dherman closed 5 years ago
It's natural to question whether pinning for reproducibility is worth the cost of extra fetching.
I think that pinning the node version of an installed binary to the version in that project's platform
key actually does less for reproducibility that it would seem. Specifically, for tools that are used in a project, we are already following the depender's settings for determining the node
version. That is, even if typescript
specifies "node": "11.6.0"
in its package.json
, if we are using typescript
in a project and have node
pinned to 10.5.0
in the project, then we'll be using node@10.5.0
to invoke tsc
, even though that's different from what was specified.
So pinning the user version to the version that was used to develop a tool only applies for user tools that aren't explicitly specified in a given project. Which (should) mean that they have no impact on the reproducibility of builds for that specific project, because if they did have an effect, then you would want to add that dependency to the project and we would be back on the above use case.
Furthermore, there are a couple of similar concepts that are being conflated by using the platform
specifier for both roles: The developer experience (for people developing a tool) and the user experience (for people using the tool in their own work). The current approach would use the same node
version for both use cases, as there's only one platform
entry, but it's not guaranteed that that is the intent of the tool authors. In general, I would imagine that CLI type tools that are expected to be globally installed are tested against multiple versions of node
, and so should be generally able to work in multiple environments.
Lastly, pinning a user tool to the platform specified in that tool's package.json
removes some of the ability for end users to control the platform that they are running on their own machine. There is no way for a notion
user to say "I want this tool to use these platform versions." As mentioned above, the user version of a tool only matters for tools that aren't specified in a project, so these are likely tools that the user has installed for their own general-purpose use. If we, instead of using the tool's platform
specifier, instead use the user's current platform
to pin the tool, then the user can easily control which version of node
is used to execute a tool.
Alternatively, we could even not pin the platform for a user tool at all, and simply use the current user platform whenever the tool is executed. That would open potential issues with compatibility, but I would have to guess that those are relatively rare, since that more closely matches the current status quo. It would also make the Notion user experience easier to understand: If you are running in a project with platform
specified, then we use that project's platform
, if you are running outside of a project then we use your user platform. No caveats about "If you are running a tool, then we will use these other platforms to run that tool specifically, but other tools might be using different platforms so will use different versions." We already have reproducibility for projects covered with the platform
specifier, combined with npm
or yarn
lock files to pin the versions of dependencies, I suspect that doing a lot of pinning for user tools outside of projects might be premature optimization, maybe we should start with the simplest and easiest to explain approach, and re-visit if there are significant problems.
After typing that all out, I think my proposal would be simplification to: If you are in a project, we use that project's platform
for all invocations. If the tool that is being invoked is installed as a dependency in that project, we use the locally installed version, otherwise we use the user-specified version that was installed when the user did notion install
. Lastly, if you are outside of a project, we always use the current user toolchain and the tool version that was installed with notion install
.
To help with any weird corner cases that might come up, I also thought of having a notion run
command that would let the user specify a toolchain at execution time, i.e. notion run --node 11.6 tsc
, which would execute tsc
in a platform that used node@^11.6.0
, regardless of the project or user platforms.
@charlespierce I'm replying to a couple things you wrote above:
Lastly, pinning a user tool to the platform specified in that tool's
package.json
removes some of the ability for end users to control the platform that they are running on their own machine. There is no way for anotion
user to say "I want this tool to use these platform versions."
Pinning the user tool to a specific platform should have no affect on the platform the user is running on their machine. And vice-versa: if the user changes their active platform, it should not break the user tool. The idea is that the user tool is installed like a standalone binary, which is statically linked to a specific platform, apart from whatever platform they have installed.
So in a typical scenario, the user runs notion install ember-cli
for example. So ember-cli
is installed as a user tool, pinned to whatever is specified in "platform"
in package.json, or pinned to whatever their current platform is (which is probably what will happen for a while, since no packages are using "platform"), and a shim is created. Every time ember
is run outside of a pinned project (or in a pinned project where it is not a dependency), it will be run with that pinned platform. So even if the user changes their platform, ember
will continue to work.
The way things are now outside of Notion, whenever you install a new version of node
(with nvm or whatever tool you use), you have to re-install any global binaries, because they no longer work. They were only installed as part of the other node
installation. So pinning them to a single static platform is necessary to prevent this "tool bitrot", which is addressed somewhere in this RFC.
That kind of leads into this as well:
Alternatively, we could even not pin the platform for a user tool at all, and simply use the current user platform whenever the tool is executed. That would open potential issues with compatibility, but I would have to guess that those are relatively rare, since that more closely matches the current status quo.
I don't think there is any way we can guarantee that a package installed using one version of Node will necessarily work when run using a different version of Node, especially between major versions. To keep user tools working as expected we have to statically pin them to a specific platform, until the user does a notion upgrade <tool>
(or whatever the syntax will be).
@mikrostew That's reasonable about Tool Bitrot making pinning the platform a desired aspect so that the tool will always work, regardless of what else changes. I still feel like double-dipping the platform
specifier, to represent both the environment that developers of a tool work in and the environment that they expect end users to work in, has some potential issues.
At the very least it's a potential source of confusion for a user: If I have my user platform set to use Node 11.6.0 and run notion install <SOME_TOOL>
, and then run <SOME_TOOL>
and it executes using Node 8.11.3, I would be confused about where that version came from. We then have to explain that a) The version came from the platform
inside the tool's package.json
and b) There's no way for the user to override it, so that tool will always use node@8.11.3
until the developers of that tool decide to change their package.json
.
@mikrostew wrote:
So pinning them to a single static platform is necessary to prevent this "tool bitrot", which is addressed somewhere in this RFC.
Yeah, I believe this is an important requirement.
I don't think there is any way we can guarantee that a package installed using one version of Node will necessarily work when run using a different version of Node, especially between major versions. To keep user tools working as expected we have to statically pin them to a specific platform, until the user does a
notion upgrade <tool>
(or whatever the syntax will be).
I agree, and in fact I think this is part of making the choice of platform more under the user's control. When I install a tool like surge
with notion install
, I don't want it to start running with, say, an ancient Node platform just because I'm sitting in a project directory that happens to use that old platform.
The way I see this is:
@charlespierce wrote:
I also thought of having a notion run command that would let the user specify a toolchain at execution time, i.e.
notion run --node 11.6 tsc
Yes! I've had this in the back of my head but didn't write it up. I'll mention it in the RFC.
I still feel like double-dipping the
platform
specifier, to represent both the environment that developers of a tool work in and the environment that they expect end users to work in, has some potential issues.
I think this is a fair concern. Maybe it's worth considering an optional syntax allowing a project to specify distinct dev vs prod platforms. By default, "platform"
would just mean both:
"platform": {
"node": "11.8.0"
},
but maybe we could allow you to specify:
"devPlatform": {
"node": "11.8.0"
},
"platform": {
"node": "10.5.0"
}
to be consistent with naming conventions like "dependencies"
and "devDependencies"
.
It's also good to break this down into representative scenarios for each case:
surge
(CLI tool for pushing static files to a free CDN)madge
(a tool for visualizing module graphs of JS projects)tsc
(TypeScript compiler)Going through them case by case:
class
), but I might be using it on the modules of a project that uses an older Node. So again, I want it to keep using the version it was built to work with.I like the potential idea of a devPlatform
, but I would imagine we also want to allow the user to specify devPlatform
without a platform
. It seems like the most likely use-case for wanting to specify a different development and release platform
for a tool is if you want all the devs to use the same version of tooling, but want the tool to work on any version. In that case, you could specify devPlatform
and not platform
, so that devs working on the tool will all get the same version, but when deployed it will just use whatever version the individual user is tied to.
Regardless, however, I don't think it's a major issue or a blocker for MVP in any way, just something to keep in our minds as adoption (hopefully) grows.
There's also a 4th use-case, that at least early on, I suspect will be pretty common: A user with Notion installed is working on an OSS project that doesn't specify a platform. In this case, we treat it the same as the "global context" case: Using the user platform and the pinned platform for any installed tools.
However, if the installed tool is a dependency of the project (e.g. tsc
), we want to execute the package-local version of that tool, using the user platform, not the pinned platform for the global version of the tool. This is because the tool was, presumably, installed by npm install
or yarn
running in the user platform, so any native compilation was done with the user platform, not the pinned platform.
@charlespierce
if you want all the devs to use the same version of tooling, but want the tool to work on any version.
Oh, this made me realize: we actually may want to use something more like "engines"
instead of "platform"
for deciding the Node runtime version to pin to an installed package.
As you say, specifying a precise version is a development concern. For deployment, you probably don’t want to have to be as precise, and you probably want to allow customers to use newer versions of Node than the particular one you chose at the time of shipping.
And that’s really what the "engines"
field is about: the broadest possible range of compatible Node runtime versions you’re compatible with. So there’s really no need to invent a new field.
That also has the benefit of reusing an existing standard and not needing to work so hard to bootstrap the ecosystem.
@dherman I like the idea of using the "engines"
field, as you say that's exactly what it's for. We can then fall back to using the platform
if engines
is specified or can't be parsed. And since the engines
should be a semver range requirement, we already have the machinery in place for parsing and resolving those :+1:
This has been an edifying thread. I’ll update the RFC! I think this also should be enough clarity to help me update the docs for the web site too.
Follow-up from a video discussion a few of us had today: we ended up convincing ourselves we prefer to original "toolchain"
syntax, instead of "platform"
. Here were our main points of rationale:
"engines"
: the "platform" name sounds like a synonym for "engines." The name "toolchain" is less confusable with "engines."LOL, we had yet another discussion about the name of the key, and @charlespierce made the case in notion-cli/notion#412 for "volta"
. I'll update the RFC.
This RFC is also overdue for merging -- I updated a few bits that were out of date but otherwise it pretty much describes the current state of the design, so I'm going to merge.
This RFC supersedes #21 and #22.
Rendered