Script access to changed/added/removed files

aomarks commented 2 years ago

Since Wireit knows which input files were changed, added, and removed between each run of a script, it makes sense to allow scripts to directly access that information to help them run incrementally. Any program which takes a list of files on stdin or in argv could benefit from this feature.

For example, an eslint script could be configured to only lint the files that were changed or added since its last successful run, and run much faster (eslint actually has a feature like this built-in, but using it with Wireit means duplicating all the work of computing changed files).

Here are a few ideas for how we could allow scripts to access changed file data:

Environment variable

"lint": {
  "command": "cat ${WIREIT_FILES_CHANGED} ${WIREIT_FILES_ADDED} | xargs eslint",
  "files": [
    "src/**/*.ts"
  ],
  "output": []
}

This would work by creating .wireit/<script>/(changed|added|removed) files before executing a script, and setting the $WIREIT_FILES_(CHANGED|ADDED|REMOVED) environment variables to those paths.
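As a rough sketch of how the three lists might be derived, assume (hypothetically) that each run records a map from input file path to content hash. The names `Hashes`, `FileDelta`, and `computeDelta` are illustrative only and are not taken from Wireit's source.

```typescript
// Hypothetical shape: input file path -> content hash recorded for one run.
type Hashes = Record<string, string>;

interface FileDelta {
  changed: string[];
  added: string[];
  removed: string[];
}

// Compare the hashes from the last successful run with the current ones.
function computeDelta(previous: Hashes, current: Hashes): FileDelta {
  const delta: FileDelta = { changed: [], added: [], removed: [] };
  for (const path of Object.keys(current).sort()) {
    if (!(path in previous)) {
      delta.added.push(path);
    } else if (previous[path] !== current[path]) {
      delta.changed.push(path);
    }
  }
  for (const path of Object.keys(previous).sort()) {
    if (!(path in current)) {
      delta.removed.push(path);
    }
  }
  return delta;
}

// One path per line, so that `cat <file> | xargs eslint` works as in the example.
function serialize(paths: string[]): string {
  return paths.map((p) => p + "\n").join("");
}
```

Each list would then be written to its own file under `.wireit/<script>/`, and the matching environment variable would hold that file's path.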

Pro: Seems like the simplest solution.

Con: We'd write these files even if they aren't being consumed (though we could require opting in with a setting).

Binary

The Wireit binary itself could print changed/added/removed files when it is called with a particular parameter. As usual, it would use the npm_lifecycle_event environment variable to figure out the context.

"lint": {
  "command": "wireit files changed added | xargs eslint",
  "files": [
    "src/**/*.ts"
  ],
  "output": []
}

Con: Extra binary invocation. To prevent duplicating delta calculations, we would want to coordinate between the main and child wireit processes, probably by writing the file lists to .wireit/<script>/(changed|added|removed) files.
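A sketch of how such a subcommand might locate the lists, assuming the main process has already written them to `.wireit/<script>/(changed|added|removed)` as described above. The function name `resolveDeltaFiles` and the exact path layout are assumptions for illustration. The environment is passed in as a parameter so the logic stays easy to test.

```typescript
type DeltaKind = "changed" | "added" | "removed";
const KINDS: string[] = ["changed", "added", "removed"];

// Map `wireit files changed added` to the list files the child should print.
// `argv` is everything after the binary name; `env` would be process.env.
function resolveDeltaFiles(
  argv: string[],
  env: Record<string, string | undefined>
): string[] {
  const subcommand = argv[0];
  const requested = argv.slice(1);
  if (subcommand !== "files") {
    throw new Error("Unknown subcommand: " + subcommand);
  }
  // npm sets npm_lifecycle_event to the name of the script being run.
  const script = env["npm_lifecycle_event"];
  if (!script) {
    throw new Error("npm_lifecycle_event is not set");
  }
  return requested.map((kind) => {
    if (KINDS.indexOf(kind) === -1) {
      throw new Error("Unknown file kind: " + kind);
    }
    return ".wireit/" + script + "/" + (kind as DeltaKind);
  });
}
```

The child process would then stream the contents of each returned file to stdout, one path per line, for `xargs` to consume.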

Stdin pipe

"lint": {
  "command": "xargs eslint",
  "files": [
    "src/**/*.ts"
  ],
  "stdin": {
    "files-changed": true,
    "files-added": true
  },
  "output": []
}

Pro: Syntax is the same on Windows and Linux/macOS, whereas environment variable and pipe syntax differ between them. I don't think we should make decisions based on this, though; it is just a perennial problem with how npm supports multiple shells.

Con: Less flexible than the other options. The other options allow specifying exactly where and how the input files are read in the shell command. Feels like the most complex in terms of the configuration syntax.
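A sketch of how the "stdin" setting might be turned into the text piped to the command. The `files-removed` key and the function name `buildStdin` are hypothetical; only `files-changed` and `files-added` appear in the example above.

```typescript
interface StdinConfig {
  "files-changed"?: boolean;
  "files-added"?: boolean;
  "files-removed"?: boolean; // hypothetical, not shown in the example above
}

interface FileDelta {
  changed: string[];
  added: string[];
  removed: string[];
}

// Build the text written to the child's stdin: one path per line,
// including only the lists the script opted into.
function buildStdin(config: StdinConfig, delta: FileDelta): string {
  let paths: string[] = [];
  if (config["files-changed"]) {
    paths = paths.concat(delta.changed);
  }
  if (config["files-added"]) {
    paths = paths.concat(delta.added);
  }
  if (config["files-removed"]) {
    paths = paths.concat(delta.removed);
  }
  return paths.map((p) => p + "\n").join("");
}
```

Wireit would write this string to the spawned command's stdin and then close the stream, so that `xargs` sees end of input.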

aomarks commented 2 years ago

One problem with this approach is that the files list often contains both the source files to a program, and its configuration files.

In the case of eslint, we don't want to pass eslint's own configuration files to eslint.

We would also need some way to slice/filter the files, then.

pesterhazy commented 2 years ago

Wouldn't it be possible to pass the list of files changed as command line parameters to the linter? That's how treefmt does it: https://github.com/numtide/treefmt/blob/master/docs/formatters-spec.md#command-line-interface

aomarks commented 2 years ago

> Wouldn't it be possible to pass the list of files changed as command line parameters to the linter? That's how treefmt does it: https://github.com/numtide/treefmt/blob/master/docs/formatters-spec.md#command-line-interface

Yep, that's what all of these examples show; that's what xargs does. I was thinking a file would be better than putting the filenames directly into environment variables, because there could be a lot of them. But maybe that's fine.

cefn commented 1 year ago

Couldn't there be a commandFiles key that is merged with files for Wireit freshness checks, but which limits the values passed to a command to a subset of those watched (typically not including the config files)?
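A sketch of that filtering step, assuming a hypothetical `commandFiles` list of globs. The matcher below is deliberately minimal, supporting only `*` and `**/`, and stands in for whatever glob library would really be used. `globToRegExp` and `filesForCommand` are illustrative names.

```typescript
// Convert a simple glob to a RegExp. Only `**/` (any number of directories)
// and `*` (any run of non-slash characters) are supported in this sketch.
function globToRegExp(glob: string): RegExp {
  let source = "";
  let i = 0;
  while (i < glob.length) {
    if (glob.slice(i, i + 3) === "**/") {
      source += "(?:.*/)?";
      i += 3;
    } else if (glob[i] === "*") {
      source += "[^/]*";
      i += 1;
    } else {
      // Escape characters that are special in regular expressions.
      source += glob[i].replace(/[.+?^${}()|[\]\\]/g, "\\$&");
      i += 1;
    }
  }
  return new RegExp("^" + source + "$");
}

// Keep only the paths that match at least one commandFiles glob.
function filesForCommand(paths: string[], commandFiles: string[]): string[] {
  const patterns = commandFiles.map(globToRegExp);
  return paths.filter((p) => patterns.some((re) => re.test(p)));
}
```

With "files" containing both `src/**/*.ts` and `.eslintrc.json`, a change to either would still invalidate the script, but only paths matching `commandFiles` would reach eslint.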

jpzwarte commented 1 year ago

I would love this as well. I'm using esbuild and at the moment it just compiles all source files. Would be great if it could only compile the modified ones.

jpzwarte commented 1 year ago

@aomarks Couldn't you detect whether $WIREIT_FILES_* is used in the command and only set them if they are?
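A sketch of what that detection might look like, assuming a plain text scan of the command string for POSIX-style (`$VAR`, `${VAR}`) and cmd.exe-style (`%VAR%`) references. `referencedDeltaKinds` is an illustrative name.

```typescript
// Return which of changed/added/removed the command string refers to,
// in order of first appearance.
function referencedDeltaKinds(command: string): string[] {
  const pattern =
    /\$\{?WIREIT_FILES_(CHANGED|ADDED|REMOVED)\b\}?|%WIREIT_FILES_(CHANGED|ADDED|REMOVED)%/g;
  const kinds: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(command)) !== null) {
    const kind = (match[1] || match[2]).toLowerCase();
    if (kinds.indexOf(kind) === -1) {
      kinds.push(kind);
    }
  }
  return kinds;
}
```

One limitation of this approach: a program launched by the command could read the variables from its environment without the command string ever mentioning them, so a scan like this would miss that case.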

pspeter3 commented 1 year ago

I like @cefn's approach which makes it clear which files are used for the command versus the build.

anoblet commented 1 year ago

We could use this too. Our development environment requires us to build SCSS, and ideally we would like to only run the command on files that have changed.

sensingturtle commented 8 months ago

+1 for this. My projects require me to compile multiple LESS files to CSS and this feature would make the main lessc compiler even more powerful. Thanks for all the work :)

deebloo commented 6 months ago

I would vote for either an opt-in environment variable or stdin. The environment variable seems to be the most flexible, but stdin feels the most "standard". This does feel like one of the last big pieces I would really like to see.

deebloo commented 2 days ago

@aomarks Curious whether there have been any more thoughts here, or whether there would be guidance if folks wanted to start working on this.
