nedap / formatting-stack

An efficient, smart, graceful composition of Clojure formatters, linters and such.
Eclipse Public License 2.0
99 stars 2 forks source link

Idea: don't re-analyze files depending on their SHA #169

Open vemv opened 3 years ago

vemv commented 3 years ago

Problem statement

Any given linter (particularly Eastwood) can be slow.

For the use case of "running branch-formatter integrated with the repl", one can observe that one might be linting again files that haven't been modified.

For example:

Proposal

Implement a strategy that removes files if all these conditions are met:

Caveats

It's plausible that fixing a given file's linting result does not depend entirely on the file itself: fixing it might be accomplished by modifying a different file.

That could cause stale caches or such.

However I'm not immediately aware if such a case. Still worth a think, on a per-linter basis.

Applicability

This optimization is only possibly useful in a local dev repl, so out of caution, I would not add the proposed strategy to the stack if (System/getenv "CI").

Other

I think that the cache should be keyed per git-branch and possibly deleted on each detected git branch change anyway. This is a basic caution against the mentioned Caveats.

Alternatives and comparison

Cache linters' reports. e.g. (caching-linter/new (linters.kondo/new {}))

Might perform better.

cc/ @thumbnail

vemv commented 3 years ago

@thumbnail I'm tempted to give a shot to the alternative you suggested. I have the vague feeling that it can be something thin to implement.

IOW, this wouldn't conflict with architectural redesigns - we could always scrap my work, but keep the integration tests.

Assuming the resulting PR is indeed thin, does this plan SGTY?

thumbnail commented 3 years ago

I think a wrapper is the most viable option. Simply using lastModified as a cache key to prevent files from being linted, and merge previous/cached reports in the end result.

I think the branch/project formatter shouldn't use the cached versions by default, that way CI doesn't either.