Closed stinodego closed 1 year ago
This is a good plan, and happy to help out on the mypy side.
I am not too familiar with most of the flake8 plugins. My only concern with adding these would be that if some plugins add a number of false positives, there is quite some additional work to opt-out particular lines of code again, increasing the developer burden. Plugins should also be actively maintained of course (for when we are upgrading to newer Python versions, etc).
yesqa also seems useful. Note that there is an open feature request in flake8 for this functionality (https://github.com/PyCQA/flake8/issues/603), but given that it has been open for some time, using yesqa seems sensible.
> This is a good plan, and happy to help out on the mypy side.
Great! I opened a PR to prepare the incremental adoption of mypy strictness features.
> I am not too familiar with most of the flake8 plugins. My only concern with adding these would be that if some plugins add a number of false positives, there is quite some additional work to opt-out particular lines of code again, increasing the developer burden. Plugins should also be actively maintained of course (for when we are upgrading to newer Python versions, etc).
Having a more opinionated formatter (in the form of these plugins) can be a great help, but we should indeed be mindful of selecting plugins that work for this code base (i.e. do not throw a lot of false positives). I think we should just try out a few of these, see what the effect would be on the code base to satisfy them, and then decide. I'll gladly create a PR for each of these.
With regard to being actively maintained, I believe these plugins are mostly modern and widely used. Let's indeed make sure we're happy about the state before adopting each plugin.
@stinodego; good stuff - the only thing I'd really disagree with here is "Set max-line-length = 88".
This is just waaaay too short for the modern era of widescreen high-resolution displays.
Recent projects I've worked on that enforce line-length (typically also via black if in python) have set it to between 100-120, which has proved to be a decent compromise between the 1980s expectation of working in a green-screen 320×200 CGA terminal and the 2020s reality of retina screens in your pocket and 4K, 5K, 6K (!) monitors. And if 120 is a bridge too far for some, let's at least go for 100, heh.
FYI: 100 is what the linux kernel set as their new max back in 2020, up from 80. You will probably not be surprised that this update came along with some fruity commentary from Linus, heh: "I do not care about somebody with an 80x25 terminal window getting line wrapping ... For exactly the same reason I find it completely irrelevant if somebody says that their kernel compile takes 10 hours because they are doing kernel development on a Raspberry PI with 4GB of RAM."
For the curious - the typical/historic 80 char line width originated from IBM punch-cards - not a joke - from which early terminals/teletypes/etc took their cues ;)
(132 was also a popular line length at one point, due to it being the max width for later terminals...)
> the only thing I'd really disagree with here is "Set max-line-length = 88".
I definitely see your point. Most work projects I do are set to 100 characters, although personally I prefer to stick to black's default 88. What I don't like about longer lines is that they tend to wrap when I have two files open side by side, like when reviewing code or writing tests.
What mostly triggered me here is the fact that black and flake8 are misaligned: we get 88 characters of code through black, and then some docstrings/comments are nicely wrapped at 88 characters, others are a bit more inconsistent, and then others are waaaay longer. Plus it looks weird when the code is wrapped at 88 but the docstrings are much longer.
I am fine with setting line length to 100 for black, and then matching it with flake8.
@ritchie46, do you have a preference here?
PS: Awesome picture, thanks for the nice historical reference
I like the current line lengths. In my setup I have half my screen minus file tree for code and I almost never have to scroll horizontally this way.
> I like the current line lengths.
In that case, I propose we stick to 88 characters. We can follow the advice outlined here though, where black suggests setting flake8 to 88 characters, but allowing some slack on the enforcement of the 88 characters. This is done by replacing flake8's E501 with flake8-bugbear's B950.
I think that could be a nice compromise! Then the guideline is clear, but we have some wiggle room.
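To make the "wiggle room" concrete: flake8-bugbear documents B950 as firing only once a line exceeds max-line-length by more than 10%. The sketch below mirrors that rule in plain Python. The function name exceeds_b950 is made up for illustration, and this is a simplification rather than bugbear's actual implementation.

```python
def exceeds_b950(line: str, max_line_length: int = 88) -> bool:
    """Return True if `line` is more than 10% over `max_line_length`.

    Illustrative only: this is not flake8-bugbear's implementation.
    """
    return len(line) > 1.1 * max_line_length


# With a limit of 88, the 10% slack means 96 characters pass and 97 do not.
print(exceeds_b950("x" * 88))  # False
print(exceeds_b950("x" * 96))  # False
print(exceeds_b950("x" * 97))  # True
```

So the guideline stays at 88, while lines up to 96 characters are tolerated.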
Has anyone here used pylint before, and have a good idea of its added value over flake8? I have tried it on our code base for a couple of files, see #4100. There were definitely some useful suggestions, as it seems to be much stricter than flake8, which has its pros and cons.
I get that shorter line lengths mean less than optimal vertical organization, and maybe I missed it in reading through here, but why not follow PEP 8 with 79?
> I get that shorter line lengths mean less than optimal vertical organization, and maybe I missed it in reading through here, but why not follow PEP 8 with 79?
Reading this explanation by black on why they default to 88 characters should tell you all you need to know. The video they link to is very inspirational, as well.
> Has anyone here used pylint before, and have a good idea of its added value over flake8? I have tried it on our code base for a couple of files, see #4100. There were definitely some useful suggestions, as it seems to be much stricter than flake8, which has its pros and cons.
I admit that I have limited experience with pylint, but here's my hot take:
I believe most open source projects use flake8 over pylint (and I have absolutely 0 numbers to back up that statement). The reason is that pylint has a lot of lints that are very subjective and debatable. You can already see it in your PR #4100 and the reaction by @ghuls (which I agree with). So it is not necessarily more strict, it's just more opinionated.
And it is opinionated by default, so you will have to spend time configuring it to your liking. Which is fine, but the result is that every contributor has to get used to your subjective way of doing things. Which might be different from the other projects they are working on.
flake8 is super basic by default. It just makes sure you're mostly following pep8. And then you can extend it from there by making it more strict. That makes it more suitable for open source repos, in my opinion.
I would be in favor of sticking to flake8 and the corresponding ecosystem, for now at least.
I agree with @stinodego. In my experience flake8 is faster to run, which helps with iteration and build times.
A little late to this conversation...
I've had great success with pre-commit / autoflake/ isort / black. Here's a sample YAML...
Somewhat related, but maybe out of scope, I think the guys over at Vector have a beautiful commit history.
I added a conventional commits enforcement, which you can see a sample of at https://github.com/dragonflydb/dragonfly/blob/main/contrib/scripts/conventional-commits
Happy to help if it's wanted.
Hi @ryanrussell ! I have talked to @ritchie46 before about using pre-commit, and he's not a fan. So, long story short, this will not happen in the near future.
I definitely see the value of pre-commit: it has some nice lints, and in the YAML it's easy to quickly see the lints you need to conform to. There is a reason many open source projects use pre-commit.
Then again, there are some lints that need to run outside of pre-commit. mypy is one of them, due to some specifics on how the pre-commit environment works (it won't find all errors). I am not sure of the Rust side of things (does pre-commit have miri?). So if we need a make command anyway that runs various linting tools, then what is really the added value of pre-commit?
The conventional commits enforcement looks cool, but this project always does squash & merge in the GitHub UI, so I'm not sure how that would work.
So for now, we're sticking with the workflow where you just run a make command to make sure your stuff lints before you open a PR. And if you forget, it'll get caught by the CI, and you'll have to fix it afterwards.
> I believe most open source projects use flake8 over pylint (and I have absolutely 0 numbers to back up that statement). The reason is that pylint has a lot of lints that are very subjective and debatable. You can already see it in your PR #4100 and the reaction by @ghuls (which I agree with). So it is not necessarily more strict, it's just more opinionated.
> And it is opinionated by default, so you will have to spend time configuring it to your liking. Which is fine, but the result is that every contributor has to get used to your subjective way of doing things. Which might be different from the other projects they are working on.
Agreed on being less used and very opinionated (and with @ghuls on the if-else), and it is more a system of opt-out, where you turn off stuff you don't like. Rather than the other way around.
> flake8 is super basic by default. It just makes sure you're mostly following pep8. And then you can extend it from there by making it more strict. That makes it more suitable for open source repos, in my opinion.
> I would be in favor of sticking to flake8 and the corresponding ecosystem, for now at least.
Agreed, but I do wonder whether there is a middle ground, as there is plenty of stuff that flake8 misses out on by default (see some of the PRs you are raising currently), which I think most would agree is sensible. But then again, that may be highly subjective, and I may just be hoping for something that doesn't exist.
I am a little wary of us ending up having to fight multiple extensions breaking on each new Python release. But maybe that is just me being conservative with adding dependencies, and we should just add them liberally and remove if the extension becomes outdated and/or a burden to fix.
> [...] we should just add them liberally and remove if the extension becomes outdated and/or a burden to fix.
That's the way I look at it personally! We can add a lint if it helps us fix some issues in the code base. And if it causes any friction, we'll just get rid of it. No harm done.
You did get me interested in diving more into pylint, so I might backtrack on this. But as I said, let's stick with flake8 for now!
I have looked into setting mypy's warn-return-any. This one throws a lot of errors, because all the Rust bindings we use return the Any type.
As an example, a simple method like DataFrame.estimated_size gives problems. Consider the following example:
    import polars as pl
    df = pl.DataFrame({"a": [1, 2, 3]})
    reveal_type(df._df.estimated_size())  # Any
    reveal_type(df.estimated_size())  # builtins.int
The backend call returns an Any type, as it is not typed to Python standards. We return the result as an int, hence the mypy error.
This happens many, many times in the code base. I don't feel it really improves the code quality if we litter our codebase with # type: ignore statements.
Curious to hear your thoughts on this, @matteosantama and @zundertj !
Yes, I noticed the same thing. I still think it is a worthwhile endeavor, but we should use cast instead of type: ignore.
There seems to be some movement on the PyO3 side to natively support the generation of stub files (see https://github.com/PyO3/pyo3/issues/510 and https://github.com/PyO3/pyo3/issues/2454). If we use cast on our side, then when this feature becomes available mypy will either warn us of a redundant cast OR an invalid cast (attempting to cast an int to a str, for example). If we instead use type: ignore, mypy will warn us about unnecessary ignore comments, but any mismatches will be silenced.
The PyO3 documentation suggests manually creating .pyi files, but I don't think this is necessary for our use-case. All our Python-to-Rust calls are hidden internally, so maintaining a .pyi file doesn't have much benefit over individual casts (besides maybe centralizing the annotations).
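A minimal sketch of the two options side by side, with no polars dependency. _UntypedBackend is a hypothetical stand-in for a PyO3 object whose methods mypy sees as returning Any, and the Frame class and its method names are made up for illustration. The no-any-return error code and the warn_redundant_casts / warn_unused_ignores settings are mypy's own.

```python
from typing import Any, cast


class _UntypedBackend:
    """Hypothetical stand-in for an untyped Rust binding."""

    def estimated_size(self) -> Any:
        return 24


class Frame:
    def __init__(self) -> None:
        self._df = _UntypedBackend()

    def size_with_cast(self) -> int:
        # If the backend ever gains a real `int` annotation, mypy's
        # warn_redundant_casts setting will flag this cast for removal.
        return cast(int, self._df.estimated_size())

    def size_with_ignore(self) -> int:
        # Silences the warn_return_any error on this line; mypy's
        # warn_unused_ignores setting reports it once it is unnecessary.
        return self._df.estimated_size()  # type: ignore[no-any-return]


frame = Frame()
print(frame.size_with_cast(), frame.size_with_ignore())
```

Both variants behave identically at runtime, since cast simply returns its second argument; the difference is only in what mypy can report later.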
Most of our code is just an adapter on the underlying PyObject.
    def estimated_size(self) -> int:
        return self._df.estimated_size()
Do you want to rewrite all such functions to the version below?
    def estimated_size(self) -> int:
        return cast(int, self._df.estimated_size())
That just makes things more unclear, in my opinion. In which way would this be worthwhile?
> Most of our code is just an adapter on the underlying PyObject.
>     def estimated_size(self) -> int:
>         return self._df.estimated_size()
> Do you want to rewrite all such functions to the version below?
>     def estimated_size(self) -> int:
>         return cast(int, self._df.estimated_size())
> That just makes things more unclear, in my opinion. In which way would this be worthwhile?
I agree.. this seems like a lot of clutter to me.
How many lints do we have that are not related to the inner polars binary?
@stinodego : didn't even see this until now, guess we figured out the problem with turning on this warning at the same time :)
> The PyO3 documentation suggests manually creating .pyi files, but I don't think this is necessary for our use-case. All our Python-to-Rust calls are hidden internally, so maintaining a .pyi file doesn't have much benefit over individual casts (besides maybe centralizing the annotations).
As I also commented in PR #4415, this amounts to duplicating almost all type annotations, as our Python API is a thin wrapper around PyO3 to start with. I.e., our API is currently already mostly type annotations, and some minimal conversion code in places. I don't think we should add another layer that is just the type annotations.
Perhaps this is tangential to the Python lints, but within py-polars we house several Rust files. As part of the CI we run cargo clippy, which is a CLI tool similar to flake8, and there are currently 110 warnings.
We could clean up the warnings and instead run cargo clippy -- -D warnings, which fails when a warning is encountered. Do people think this is a worthwhile endeavor?
> Perhaps this is tangential to the Python lints, but within py-polars we house several Rust files. As part of the CI we run cargo clippy, which is a CLI tool similar to flake8, and there are currently 110 warnings.
> We could clean up the warnings and instead run cargo clippy -- -D warnings, which fails when a warning is encountered. Do people think this is a worthwhile endeavor?
I'd be very much in favor. Cleaning up the Rust-Python bindings was on my to do (dispatching stuff to expressions was part of this effort). Conforming to the clippy linter would be a good step to take!
Yes, this is something we must certainly do. It has been on my todo list very long. :sweat_smile:
Looking at it now.
Fixes some: https://github.com/pola-rs/polars/pull/4486
Those "borrow_deref_ref" warnings are weird - I don't really see how we're even doing that? They're just a simple &str a lot of the time
Could be a bug/false positive in clippy. Had that already quite a few times. I shall take a look first.
Edit: The cause might also be that we have wrapped these in #[pymethods], which is a proc macro, so the actual code will be different. We need to expand the macros to check that.
I think we're good on Python lints for now. Closing this issue. Feel free to add new lint ideas here or open a new issue.
As the project becomes more popular, we can expect more people to start contributing to the code base. Having a good linting setup will make sure our code quality remains consistently high, while aiding in the code review process. I outlined a number of tools/settings that I think will help. Suggestions are more than welcome.
flake8
Set max-line-length = 88
[...] # noqa: E501.
flake8 plugins
flake8 has a rich plugin ecosystem with additional lints that can help keep your code clean. They can be enabled simply by adding them to our build requirements. Using these, flake8 becomes more like the programming buddy that cargo is for Rust. Below is a list that I recommend (loosely in order of importance):
All of the following find legitimate issues in the existing code base:
[...] if TYPE_CHECKING: blocks in order to minimize import overhead. Cannot use this right now due to requirement of Python 3.8 and up.
We will skip the flake8 lints below for now. They have minimal impact.
The following should be nice to enforce, but we are currently compliant:
[...] Literal) are valid in your supported Python versions.
The following I am not sure about, but might be useful:
[...] mypy.
mypy
I would like to set strict = True for mypy in order to improve reliability and quality of our type hints. This currently produces 1157 errors in 38 files. The strict flag is a combination of multiple strictness-related flags. I recommend we enable these one-by-one and fix the related errors.
Other helpful CLI tools
These can be added as additional commands in the CI pipeline.
[...] warn-unused-ignores. It makes sure all the # noqa comments are actually necessary. Not worth incorporating in the CI right now.
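For reference, the if TYPE_CHECKING: pattern mentioned in the plugin list looks roughly like the sketch below. The example is generic rather than taken from the polars code base, and the helper name describe is hypothetical.

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only evaluated by type checkers; skipped at runtime, so the import
    # cost is not paid when the module is loaded.
    from decimal import Decimal


def describe(value: Decimal) -> str:
    """Format a value; the annotation is never evaluated at runtime."""
    return f"value={value}"


print(describe(3))
```

The `from __future__ import annotations` line is what keeps the Decimal annotation from being looked up at runtime, where the name is not imported.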