macrome-js / macrome

The in-tree build system
MIT License

Feature: support stable id macro generators #32

Open conartist6 opened 2 years ago

conartist6 commented 2 years ago

I think there's room to build a really cool feature here which fits with my general goal of exposing the behind-the-scenes magic that makes most modern JavaScript run. The idea is basically to build support for continuous codemods. These would be evaluated continuously on source files and would have the opportunity to react to changes in those files. The main purpose of this would be to insert stable identifiers into the code.

In practice I've seen two kinds of expressions that would benefit from being tagged with stable ids: errors and translations. Each will likely need to be tracked and cross-referenced across multiple versions of a codebase, which may even have structural differences. For translations the most stable possible ID is valuable, since churn in the unique key of a translation likely leads to paying translators to redo translations that are already done. For errors the advantage is in aggregating errors with the same fundamental cause, which gives engineers good data for statistical analysis and for integration across a range of products.

Here's what I imagine this looking like:

import { uid } from 'unique.stable.macro';

throw new CodedError('test', { code: uid`` });

When a file with those contents is saved, a generator would react and attempt to generate changes. unique.stable.macro would find the empty uid tagged template string and fill it with some unique identifier, resulting in something like this:

import { uid } from 'unique.stable.macro';

throw new CodedError('test', { code: uid`001F` });

Any further evaluations of the generator and the macro must not write the file at this point, since each write would trigger another evaluation and cause an infinite loop.
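
To make the no-rewrite requirement concrete, here is a minimal sketch of a fill step that is safe to run repeatedly. Everything in it is an assumption for illustration: fillEmptyUids is a hypothetical name, the ids are simple hex counters, and a regex stands in for the AST traversal a real macro would use.

```javascript
// Hypothetical sketch, not macrome's API: fill empty uid`` tags and leave
// already-filled ones untouched. A regex stands in for real AST traversal.
const UID_TAG = /\buid`([^`]*)`/g;

function fillEmptyUids(source) {
  // Collect ids already present so new ids never collide with them.
  const used = new Set();
  for (const match of source.matchAll(UID_TAG)) {
    if (match[1] !== '') used.add(match[1]);
  }

  let counter = 0;
  const nextId = () => {
    let id;
    do {
      counter += 1;
      id = counter.toString(16).toUpperCase().padStart(4, '0');
    } while (used.has(id));
    used.add(id);
    return id;
  };

  const output = source.replace(UID_TAG, (whole, id) =>
    id === '' ? 'uid`' + nextId() + '`' : whole
  );

  // changed === false tells the caller that no write is needed.
  return { output, changed: output !== source };
}
```

A caller would write the file only when changed is true, so a second evaluation of the already-filled file is a no-op and the loop terminates.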

conartist6 commented 2 years ago

I think better terminology for this is a reentrant generator. A reentrant generator would understand whether it had already made a particular change, and would then avoid making it again. A once-and-only-once codemod could actually be a specific case of this general scenario where the codemod would overwrite a central file updating it with a value indicating that the codemod had run. I think it makes excellent sense for a codemod to store its state by modifying code. If this mechanism was embraced it would also mean excellent support for providing such functionality in non-js languages.

conartist6 commented 2 years ago

So what would the API requirements for a reentrant generator look like? It would need hashes to understand whether the contents of a file had changed. We already have a system for grabbing mtimes, both through watchman and in our standalone build process. We could add hashes alongside mtimes (or maybe eliminate mtimes?? not sure) and guarantee the stability aspect for all generators, which would be a big win even beyond codemods.
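
As a rough sketch of the hashing idea, assuming Node's built-in crypto module and a hypothetical in-memory ChangeTracker (nothing here is existing macrome API), change detection could key off content rather than mtime:

```javascript
const { createHash } = require('node:crypto');

// Hash file contents so that a rewrite with identical bytes is not a change.
function hashContent(content) {
  return createHash('sha256').update(content).digest('hex');
}

// Hypothetical in-memory cache: path -> hash of the last content seen.
class ChangeTracker {
  constructor() {
    this.hashes = new Map();
  }

  // Returns true only when the content differs from what was last recorded.
  hasChanged(path, content) {
    const hash = hashContent(content);
    if (this.hashes.get(path) === hash) return false;
    this.hashes.set(path, hash);
    return true;
  }
}
```

With something like this, a save that touches mtime but leaves the bytes identical would not retrigger any generator.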

I believe it should be generally illegal for map to overwrite its own source unless the generator opts into such behavior by specifying a codemod: true property. This would have the benefit of making the intent of such generators much more apparent to us, and might allow us to do better validation and messaging. Perhaps in a more general case it should also be illegal for a generator to overwrite a file known to be generated by a different generator, as this is also likely to violate our system design assumptions in ways that may not be immediately apparent.
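
A sketch of what that validation might look like. The shapes are assumptions: a generator is taken to be an object with a name and an optional codemod flag, and generatedBy is a hypothetical map from an output path to the name of the generator that produced it.

```javascript
// Hypothetical guard, not existing macrome code. Throws when a write would
// violate either of the two rules described above.
function assertWriteAllowed(generator, sourcePath, targetPath, generatedBy) {
  // Rule 1: only codemod generators may overwrite their own source.
  if (targetPath === sourcePath && generator.codemod !== true) {
    throw new Error(
      generator.name + ' tried to overwrite its source ' + sourcePath +
        ' without specifying codemod: true'
    );
  }

  // Rule 2: no generator may overwrite another generator's output.
  const owner = generatedBy.get(targetPath);
  if (owner !== undefined && owner !== generator.name) {
    throw new Error(
      generator.name + ' tried to overwrite ' + targetPath +
        ', which was generated by ' + owner
    );
  }
}
```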

conartist6 commented 2 years ago

Also RE the previous once-and-only-once codemod, a codemod generator itself would be another use case for a stable id, which could then be used in the places that track whether (and perhaps even where) a given mod had been applied. This would give mods the freedom to change path or name, which are the other ways we might try to key them for reference elsewhere.
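
Roughly, keying on a stable id could look like the sketch below. The names are hypothetical: a mod is assumed to be an object with an id and a run function, and appliedIds stands in for the checked-in central file that records which mods have run.

```javascript
// Hypothetical sketch: run a mod at most once, keyed by its stable id
// rather than by its path or name.
function applyOnce(mod, appliedIds) {
  if (appliedIds.has(mod.id)) return false; // already applied: do nothing
  mod.run();
  appliedIds.add(mod.id); // record the id so later evaluations are no-ops
  return true;
}
```

Because only the id is consulted, renaming or moving the mod would not cause it to run again.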

conartist6 commented 2 years ago

If macrome succeeds then codebases using it will contain more than one checked-in transform of a given file. If there are two copies of the same file and both contain uid`001F`, then the duplicated macro call could lead to evaluations which make it appear as if the uniqueness constraint is violated. This could be dealt with in configuration by excluding the non-source-of-truth files from processing with the uniqueness generator, but I think we could probably do better by having the macro eliminate itself when it is not running inside a generator that is codemod: true. The result of evaluating macros in a non-codemod generator would then be:

throw new CodedError('test', { code: `001F` });

This suggests a need for some amount of communication between a generator and a macro evaluation context. We do already have a mechanism for that, which we use in async.macro.
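
A minimal sketch of that self-eliminating behavior, assuming the generator passes a hypothetical codemod flag into the macro's evaluation context. The function name and option shape are illustrative, and regexes stand in for a real AST transform.

```javascript
// Hypothetical sketch: outside a codemod generator the macro erases itself,
// leaving a plain template literal behind.
function evaluateUidMacro(source, { codemod }) {
  // Inside a codemod generator the tag stays in place so ids remain tracked.
  if (codemod) return source;

  return source
    // Drop the macro import along with the blank lines that follow it.
    .replace(/^import \{ uid \} from 'unique\.stable\.macro';\n*/m, '')
    // Strip the uid tag but keep the template literal and its id.
    .replace(/\buid(`[^`]*`)/g, '$1');
}
```

Generated copies of a file would then carry only the literal id, so the uniqueness generator never sees a second uid call.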