use :naive parallelization for eastwood

thumbnail commented 3 years ago

Brief

Sets eastwoods parallel? to :naive for gains ⚡

QA plan

Run the following snippet in projects with both v4.3.0 and a local install with this change

(require '[formatting-stack.strategies :as strategies]
         '[formatting-stack.linters.eastwood :as linters.eastwood]
         '[formatting-stack.protocols.linter :as protocols.linter])
(time
  (protocols.linter/lint!
    (linters.eastwood/new {})
    (strategies/all-files :files [])))

for formatting-stack this yields a 2x improvement (39s -> 17s)

Author checklist

[x] I have QAed the functionality
[x] The PR has a reasonably reviewable size and a meaningful commit history
[ ] I have run the branch formatter and observed no new/significative warnings
[x] The build passes
[x] I have self-reviewed the PR prior to assignment
Additionally, I have code-reviewed iteratively the PR considering the following aspects in isolation:
- [x] Correctness
- [x] Robustness (red paths, failure handling etc)
- [x] Test coverage
- [ ] Spec coverage
- [ ] Documentation
- [ ] Security
- [x] Performance
- [ ] Breaking API changes
- [ ] Cross-compatibility (Clojure/ClojureScript)

Reviewer checklist

[ ] I have checked out this branch and reviewed it locally, running it
[ ] I have QAed the functionality
[ ] I have reviewed the PR
Additionally, I have code-reviewed iteratively the PR considering the following aspects in isolation:
- [ ] Correctness
- [ ] Robustness (red paths, failure handling etc)
- [ ] Test coverage
- [ ] Spec coverage
- [ ] Documentation
- [ ] Security
- [ ] Performance
- [ ] Breaking API changes
- [ ] Cross-compatibility (Clojure/ClojureScript)

vemv commented 3 years ago

Please verify exactly what Eastwood :parallel? does - I suspect it's dangerous. I've had a negative experience with parallel application of tools.analyzer:

https://github.com/clojure-emacs/refactor-nrepl/pull/247#issuecomment-561009122

thumbnail commented 3 years ago

Please verify exactly what Eastwood :parallel? does

This change has been verified and there's a QA which has seen extensive testing locally. the :parallel? option parallelizes after the analysis is done. I'd've opted for a partitioning-pmap solution if lint-namespace was public API.

Absolutely try for yourself, always interested to see if it works as it should for everyone :)

vemv commented 3 years ago

Ok, I'll state the facts myself:

:parallel? causes a parallel invocation of lint-namespace https://github.com/jonase/eastwood/blob/7672782d748bcc15c0ea5bf1d913c3d47c9e0810/src/eastwood/lint.clj#L391-L396
lint-namespace calls lint-ns https://github.com/jonase/eastwood/blob/7672782d748bcc15c0ea5bf1d913c3d47c9e0810/src/eastwood/lint.clj#L353
lint-ns calls analyze/analyze-ns https://github.com/jonase/eastwood/blob/7672782d748bcc15c0ea5bf1d913c3d47c9e0810/src/eastwood/lint.clj#L162
And accordingly the analyze-ns ns uses tools.analyzer: https://github.com/jonase/eastwood/blob/7672782d748bcc15c0ea5bf1d913c3d47c9e0810/src/eastwood/analyze_ns.clj#L13

This change has been verified and there's a QA

QA and tests can only prove the absence of known flaws. Please accept the fact that tools.analyzer, just like clojure.core/require and clojure.core/eval need a non-trivial amount of care to be thread-safe.

Evaluating an arbitrary set of namespaces in parallel and no specific dependency order is the opposite of that.

thumbnail commented 3 years ago

Thanks for bringing this to our attention.

Because the issue you describe doesn't occur in several large internal projects I did some digging. It turns out that the change works fine in our testing because the codebase is reloaded beforehand, so no (new) code is evaluated by running Eastwood. Probably eliminating the issue.

Formatting stack is designed to work on reloaded code. So while this shouldn't be an issue, I don't feel comfortable applying this change without a construct to ensure that has happened.

This means that using :parallel? should be opt-in, archivable with future work on configuration.

For the :exclude-linters change; without a way to 'replace' a part of the default-config, the consumer can no longer turn these linters on. This is too pending on future work on configuration.

The last change (removing extra keys in reporting) has little effect (unless debugging the linter). So it'll be postponed until more work is needed on this piece.

vemv commented 3 years ago

It turns out that the change works fine in our testing because the codebase is reloaded beforehand, so no (new) code is evaluated by running Eastwood.

Not sure of how you invoke f-s these days, but in my case I tend to use its :in-background? true option.

So a rapid succession of (refresh :after format!) calls can create a race condition.

This is unrelated to the :parallel? option though. Anyway if you are curious, the root cause is that t.n lacks a global lock, which is a flaw in itself no matter what. I'm working on a drop-in t.n replacement which features a lock, and can be honored from f-s: wip

Probably eliminating the issue.

A race between t.n and Eastwood was not my original concern to begin with. An isolated Eastwood invocation with :parallel? is dangerous (or at least, not cautious) because clojure namespaces can define arbitrary contents, which aren't necessarily designed to be evaluated in parallel.

You might not experience issues for a year and still you could have them any given day. Also it is extremely hard to try guessing how this will behave in arbitrary codebases, i.e. using just one or two organization's repos can be bit of a small sample size, leading to various biases.

At a deeper level, it's worth reiterating that f-s, as designed, defaults to agnosticism. Its very goal is eliminating futile discussions between developers with different sensibilities. There's no absolute truth (like a :parallel? value), and hopefully that will remain a first-class concept for us! The only limit is when a default stops being cautious in face of reasoning and evidence. (Which doesn't limit configurability, only the default value itself)

So, very obviously I don't have a problem if you set :parallel? true in your own configs, or maybe for handiness you follow the same model I did with https://github.com/nedap/formatting-stack/pull/149 (tldr: Java system prop, leave the default unchanged, explicitly exercise in CI for an indefinite time)

nedap / formatting-stack