volta-cli / rfcs

RFCs for changes to Volta
Other
17 stars 14 forks source link

Per-project configurations #33

Closed charlespierce closed 5 years ago

charlespierce commented 5 years ago

Support Volta configuration of hooks on a per-project basis using a .volta/hooks.json file in the project.

Rendered

charlespierce commented 5 years ago

@rwjblue I think the performance penalties are definitely something to keep in mind, but I'm not sure a notion run command covers the case of wanting a single source-of-truth for the node version that can be used for build caching as described in that issue.

Additionally, while it's definitely true that parsing extra files has a cost, in this case at least, the cost is limited to one extra parse, and not N like in the case of parsing the dependent package.json files. We also already have to do the discovery of the package.json, and we can require that the new file be a sibling of that so as not to have to double-up the speculative file-structure walking.

Also worth noting: Because we have event reporting hooks in hooks.toml, we actually do parse that file on every invocation (though for the general case it's not parsed until after the command has run), so in practice there won't be a difference between the two (though that can certainly change if we decide the performance hit is too great).

rwjblue commented 5 years ago

Because we have event reporting hooks in hooks.toml, we actually do parse that file on every invocation

Sorry, why do we need hooks.toml for every invocation of a tool (either notion install'ed or node/yarn)? I'm not intimately familiar with what we store there that would be needed...

dherman commented 5 years ago

@rwjblue For event reporting, e.g. when you've hooked up Volta to log diagnostic information for internal reporting/monitoring.

dherman commented 5 years ago

Let me map out some of the constraints first, and then I'll conclude with a proposal.

Constraints

Category: Syntax

Constraint: Refactoring We should try to stick to consistent configuration file syntax in all places that the same kinds of configuration information might live (e.g. project-local vs user) so it's as easy as possible to refactor.

Constraint: Learnability The config syntax should be as readily understandable as possible for JS developers.

Constraint: Comments We should support a way to write comments in configuration files. (Why, Crock, why?!)

Category: Filesystem layout

Constraint: Separate layers and ownership Tool pinning and site-management hooks represent different roles, which will often be performed by different people. The former is managed by the development team when choosing or upgrading tools. The latter is managed by the team responsible for policies around where tools are provisioned from, which is often an IT or infrastructure org.

For learnability, readability, and independent maintenance, it’s best to keep configuration relating to hooks in separate files from configuration relating to pinned tools.

Constraint: Low adoption overhead For smaller open source repos, a mandatory extra top-level file or directory would impose cognitive overhead and be resented by people considering adopting Volta.

Constraint: Future-proofing for more configs We should future-proof against the possibility of ever adding multiple Volta files to the top-level, even if they're opt-in.

Constraint: Fast path performance Complex filesystem lookups could impose latency costs on fast paths. The design should not conflict with Volta's fast-path performance goals.

Constraint: Single source of truth If we allow multiple possible representations for identical information (e.g. package.json and a project-local config file), we should at least not allow inconsistent data to coexist in more than one location.

Constraint: Workspace support It should be possible to support workspaces (aka monorepos) by mirroring the Yarn workspace algorithm, allowing configuration and tool pinning to be shared across multiple projects within a workspace.

Constraint: Ready-made cache key for Docker It'd be nice to have a standardized file in source control that contains your basic tool versions and nothing else, so that the file can be used as a cache key for layer caching in a Dockerfile. This allows Docker users to just COPY the file from source control directly in their Dockerfile and then install the tools that correspond to it, and the cache will never be invalidated as long as those tool choices don't change.

Otherwise, the way to provide that data as inputs is by extracting the information with shell commands and passing it along either through generated files, environment variables, or Docker build arguments. But not every system that provides support for Dockerfile has the right hooks to allow performing the shell commands at the right point, which then ends up requiring users to do manual steps and keep up to date with the source of truth.

Proposal

Putting these considerations together brings me to:

Syntax (now)

The RFC I think correctly identifies that JSON is the most familiar config syntax to use. This doesn’t allow comments, but see the next section. I think for now, the amount of config is small enough that lack of comments isn’t a high priority issue.

Syntax (future)

I think we can just add support for comments. JSON + comments is an increasingly common phenomenon, and how much I care what Crock thinks about it + $4.15 will get you a Venti Latte at Starbuck’s. 😛

There are already Serde implementations of syntaxes like HJSON. This would be a backwards-compatible change so we don’t need to block on figuring out exactly which superset of JSON we want to support.

Filesystem layout (now)

Tool pinning should only be supported in package.json for now, maintaining the separation of developer-facing tool pinning information from low-level site configuration.

This doesn’t solve the Docker problem but see the next section. I think for now, there are workarounds people can use in those setups (e.g. automatically generating the cache key file they want as part of their build step, as a git trigger, etc).

Other per-project configuration should go in a .volta directory at project top-level, future-proofing for additional configuration files down the road. For now, the hooks should go in .volta/hooks.json. If we want to commit to a particular human-JSON dialect now, we can choose a different filename extension, but the tsconfig.json precedent suggests to me that it’s OK to stick with .json.

Possible extensions (future)

In the future, we could support the Docker use case by allowing projects to opt in to using a tool lockfile instead of package.json. This would live in .volta/tools.lock. We would want to test for latency regressions, but I suspect it can be made negligible.

I could imagine simple commands like volta pin --lockfile as a way of opting in.

I believe that workspace support should be something we can add without significant incompatibility—basically, the resolution algorithm would search further up for workspace roots, to allow more sharing. So it’s not an immediately blocking issue, but it’s reasonably high priority since workspaces are popular.

chriskrycho commented 5 years ago

One initial question-for-clarification on that proposal:

Tool pinning should only be supported in package.json for now, maintaining the separation of developer-facing tool pinning information from low-level site configuration.

Here, I'm wondering what you mean by "for now"—i.e. for the next little bit while we have other higher priorities? Or indefinitely unless/until we do a further RFC in that direction? My own suggestion would be to include it as a planned extension of the current behavior, but with a known lower prioritization. In that case, we would need a small bit of additional design to specify which "wins" if both are specified.

One thought, related to that:

We would want to test for latency regressions, but I suspect it can be made negligible.

In principle I think these could "just" be run in parallel and so we'd only be waiting on the I/O and parse of the slower of them – almost certainly package.json. (I'm curious if anyone sees a reason we couldn't do that.)

dherman commented 5 years ago

@chriskrycho

My own suggestion would be to include it as a planned extension of the current behavior, but with a known lower prioritization.

I'm not sure I have a strong inclination either way. My only thought is that I'm not confident we've really surfaced all the constraints around why people would want a separate lockfile. Maybe the Docker layer caching use case is enough? I just wonder if it might want some time for more users to onboard and surface use cases and constraints.

charlespierce commented 5 years ago

Updated with limited scope (only hooks) and with an additional critique section about the file format.