dosisod / refurb

A tool for refurbishing and modernizing Python codebases
GNU General Public License v3.0
2.48k stars 54 forks source link

Consider fixating mypy to <0.990 instead #40

Closed nth10sd closed 2 years ago

nth10sd commented 2 years ago

mypy released version 0.982, presumably a bugfix. We should consider fixating the mypy version here to be <0.990 instead, since bugfixes shouldn't break anything, or am I incorrect?

dosisod commented 2 years ago

Bugfixes shouldn't break anything, but anything is possible. I think having a single version of mypy is a bit restrictive, though allowing a range of mypy versions which are tested for compatibility with Refurb would be ideal. Setting the version to <0.990 will probably be safe, but from the release/tagging history, "bug fix" releases are somewhat sparse, and Mypy is not updated very often (usually many months, excluding cherry picked bug fixes), so we would be updating the version from <0.990 to <0.1000 about as often as we would be moving 0.990 to 0.991.

To summarize, we should allow a range of mypy versions, but not future/bug-fix releases via <0.990. Mypy versions are spaced out enough to where it shouldn't be too hard to verify each release is compatible or not, and update the version range accordingly.

henryiii commented 2 years ago

I would generally recommend not capping, but rather using a lock file or other locking mechanism (like pre-commit).

dosisod commented 2 years ago

@henryiii , what do you think the best path forwards would be in this regard? Refurb's only hard constraint upon users is the Mypy package (and it's dependencies), and given that Refurb uses the internals of Mypy as opposed to using it the way most users will be using Mypy (solely as a type-checker), I feel it is more important to test that each new Mypy version works.

As I have gathered from your post on the matter, you feel like it should be the users decision whether or not a package should be pinned or not. Is that correct? This is my first major Python project, and so getting versioning right for the end-user sake is paramount, though I still want to maintain some confidence in Refurb's stability.

From what I can tell, there are 3 contexts where Mypy is being used in relation to Refurb:

  1. End user of Refurb:

    • Currently, Mypy is pinned via Refurb
    • This could conflict if they already pin mypy, or have a newer version installed
  2. Plugins for Refurb:

    • Refurb will be pinned/upper bound, which could propagate mypy version conflicts
    • Not many (if any) plugins have been made for Refurb, so I don't have an answer for how versioning works in this regard
  3. Developers of Refurb:

    • Mypy is the heart of Refurb. Refurb uses Mypy's internal API, which from what I can tell, is not exactly anticipated to be used by users (Refurb being the user). I have not had any issues with Mypy's API changing dramatically, though I have only been developing this project for 2-3 months.
    • Mypy is also used for its normal use case: type checking the codebase. Since we cannot have 2 versions of mypy installed, this can cause some issues; For example, pattern matching support is hit-or-miss in Mypy (as of current), such as this issue: https://github.com/dosisod/refurb/blob/4230bbbfa8b191650add38ebfee9682f8e647efe/refurb/checks/string/startswith.py#L37-L43 Preferably, this would be written like this: match extract_binary_oper("or", node):, but that would cause a crash when type checking that file. It would not cause a crash for the end user, only the developer. This means that if we update the code (when it is fixed), we will have to bump the minimum version, despite it only affecting developers. Perhaps there is a way to specify developers need a higher version, while end users can still install a lower version.
    • There is a potential for Refurb to be compiled via mypyc (see #42), which might further complicate versioning

Sorry if that was a lot, but there is a lot to consider when talking about Mypy versioning, and I want to get it right. Curious what suggestions you might have given the situation!

henryiii commented 2 years ago

Ahh, this is more complex than I thought - I wasn't expecting referb to use mypy internals. If users are using a proper tool like pipx or "environment per check" tools like pre-commit, nox, tox, or hatch, then a strict pin here isn't too bad, and it's very much in the "allowed" pins - no pin is nice, but some are basically required, and using an internal private API is a good reason to. Of course, users wanting to install this in a shared "dev" environment (like Poetry pushes), then this is problematic, but probably unavoidable. Quickly releasing after mypy releases (and maybe also checking against the master branch in your CI too) is probably the best you can do. You might have to judge from experience about allowing new patch releases. With internal usage, you are justified in not allowing it. Also be sure to test your whole range!

Also, please do not use ^ syntax. It's a (IMO terrible) Poetry thing, and it's going away in 2.0 anyway. They will finally follow PEP 621 then, like PDM, Hatch, Flit, Setuptools 62+, and everyone else.

I'd recommend using pre-commit for type checking the code with mypy. Then you just don't care what version(s) of mypy you support when you install refurb, because you aren't installing it in the pre-commit environment. (I'd recommend looking at https://scikit-hep.org/developer/style, that sort of mechanism is used on a lot of projects very successfully). Pre-commit pins the mypy version and can auto-update it, much like dependabot (but runnable manually or with pre-commit.ci). Other options are nox, tox, or hatch - basically you shouldn't have one "dev" environment, but only install what's needed for each task.

Don't think mypyc would complicate anything. You'd just be providing pre-compiled binaries, but you'd have the same requirements - it only compiles the Python parts of your code, you'd still be making calls to a mypy dependency, I think.

dosisod commented 2 years ago

To be honest, Poetry just generated the pyproject.toml for me, and I never really touched it. In fact, the ^0.981 line isn't even doing what I think it is doing, because Mypy has a weird 0.xxy versioning scheme, where xx is the minor, and y is the patch. In this case, the caret will just pull the latest version every time, if I am not mistaken.

Since you seem more familiar with Python packaging as a whole, what would be the recommended way to consolidate the requirements.txt/dev-requirements.txt/pyproject.toml files? If I move everything into the pyproject file, I might be excluding certain package managers, though IIRC, most modern package managers (pip, pipx, poetry) support installing from the pyproject file.

henryiii commented 2 years ago

The caret in Poetry does "semantic version capping" (as if that's actually a real thing in Python, which it's not, it's invented by Poetry). So ^0.123 translates to >=0.123,<0.124. It's "starting here up to next major version". Basically it's an exact pin in MyPy's case. If you had a non-zero version, then it's ^1.23 -> >=1.23,<2. Better to be explicit, and even better to avoid the cap in cases where it's not needed - python=^3.10 will not only add the silly <4 part of the cap, but it will force all poetry projects that depend on you to add the same Python cap, regardless of if they like it or not. I could never depend on your package if you had wanted me to use it as a library because I'd refuse to add a <4 cap to Python (well, if I used Poetry, which I also would not, even before they added a 5% CI failure chance to "deprecate" a feature...). Personally I use PDM (which is a drop in replacement), and I use hatching for the build backend.

So I'd either replace the hard pin with an obvious hard pin (==), or a narrow range >=,< in this case with mypy.

The package manager doesn't care what the the build backend tells it the requirements or the extras are. From PyPI, you aren't installing from either one (wheels don't contain pyproject.toml and shoudlnt'l contain requirements.txt), but instead the package manager is reading it from the wheel metadata.

The main reason to have a requirements*.txt would be if you need to install dependencies but not the package, which is pretty specialized, or if you need a non-Poetry "lock-file" (this would be requirements.in -> fully specified requirements.txt). In your case, I don't think it would affect anyone to remove them.

I don't remember if poetry exports or provides a way to export the dev requirements into a dev extra; I usually do that (and usually avoid mega dev environments anyway, especially in poetry, which can take forever to upgrade the lock if it gets large).

PS: Poetry also caps the dev requirements, which is ridiculous, you are manually updating a lock file anyway, so why keep "poetry upgrade" from upgrading? If something breaks you are supposed to not commit the changed lock file. :shrug:

quantumpacket commented 1 year ago

@dosisod 0.991 is out with a few bug fixes.

dosisod commented 1 year ago

Thank you @quantumpacket , I will go ahead and bump this sometime before the end of the day.