Closed dsgibbons closed 1 year ago
I think this is a great idea in principle! I love black and ruff, and they could be an enormous help in improving code quality.
One concern though is that reformatting with black
might make it very difficult to merge any changes between forks, as the diff would be enormous and merge conflicts would likely arise everywhere. It would be great if the original repo were already formatted with black, but we have to be pragmatic.
I think keeping the diff between forks minimal & understandable is rather important for us, as we want to make it easy to see what has changed in this fork, and we want to be able to merge PRs that were originally raised on the original repo. Arguably that is more important at the moment than having a nice code style.
So... should we start with just Ruff but not Black? We should certainly fix the logical issues, e.g. ==
vs is
and all that.
I'll chip in here as well. Here are my thoughts:
black
into CI/CD at this stage. I propose that we do that once we are more or less done with porting the PRs and fixing the more pressing issues. That would make our lives much easier.black
later on.C4
rules in the future, e.g. C400. It would save us some effort in the future if we are conscious now NOT to write code that violates C400. I think a happy medium is while keeping CI ruff running the bare minimum checks, we can set a stricter set of rules on our own local development environment. There's no way to enforce this, of course, but it's what I will be doing with my PRs.I agree wholeheartedly, @thatlittleboy .
On your point (4) about linting new changes to a higher standard, one possibility is given in the ruff docs:
When enabling a new rule on an existing codebase, you may want to ignore all existing violations of that rule and instead focus on enforcing it going forward.
Ruff enables this workflow via the --add-noqa flag, which will adds a # noqa directive to each line based on its existing violations.
If we went this route, we'd need to distingush somehow between existing # noqa
directives that have were deliberately added to suppress false positive warnings, versus any new # noqa
directives that are temporary placeholders and may be masking a genuine error.
Thank you for the great feedback @thatlittleboy and @connortann. I agree with your points. I'm definitely in no rush to add comprehensive linting and formatting. I mainly raised this issue so I could cross #2594 and #2595 off from #2.
I like the idea of only applying ruff
to new code. I'm not entirely sure about ruff --add-noqa
though. In addition to the false positive situation mentioned by @connortann, won't all of those additional # noqa
directives clog up the diff w.r.t slundberg/shap
?
For now, I'm happy to discuss a minimal set of ruff
rules that should be applied to the entire codebase inside #25
It seems like we've got consensus to avoid Black for now, and to run Ruff on the whole codebase. I think we can close this issue now that #25 has been merged.
As a longer-term objective, it would be good to get a greater set of Ruff checks passing.
Following PRs #2594 and #2595, consider using
black
andruff
to standardize code style. We may disable manyruff
rules to get started. For example, I don't fancy adding static type checkin to the entire project yet.mypy
would be nice to add one day, and may help with some of SHAP's rough edges.