virt-manager / virt-manager

Desktop tool for managing virtual machines via libvirt
https://virt-manager.org
GNU General Public License v2.0
2.35k stars 439 forks source link

RFE: refactor the code base with black #722

Open ptoscano opened 1 week ago

ptoscano commented 1 week ago

Python relies a lot on indentation, and PEP 8 [1] already provides guidelines of a basic consistent code style; that said, there are many ways to still format a Python code base.

Since the last few years, a new PSF-backed project has been providing a consistent way to format Python code, using a tool called black [2]. black is a very opinionated formatter, and it does not allow for many tweaks (mostly line length, whether to normalize strings). In addition to that, black also offers a way to check whether the formatting is correct, which makes it a good fit also in CI jobs.

virt-manager does not have a consistent style; this is seen from time to time in contributions, as there is confusion on how to format bits. Hence this RFE: reformat the code base using black. The only needed steps would be to decide on a line length (the default of black is 88), and whether normalize strings (the default is yes); in case any of the two would be needed, a simple pyproject.toml can be added with the config.

Formatting the code with black has a various benefits:

  1. a unified code style across virt-manager; no more differences even within the same file
  2. a simple way to format the code base: run black on the whole source directory, on specific files, etc, and you are good to go
  3. no more need to think about the formatting when doing changes, simply run black before committing (it can be also automated as local pre-commit hook)
    • this also simplifies the contributions in PRs
  4. a way to check that the style is correct: black --check does everything needed
    • this can be integrated as CI job
    • the psf/black@stable github action greatly helps in this
  5. the unified style will basically get rid of all the flake8 issues related to formatting
    • because of the black formatting, E203 and W503 may be reported by flake8, however it is easy to ignore them

There are also few downsides:

  1. there would be a giant commit with lots of formatting changes
  2. backports (usually done by distros) to versions released before the reformatting would be harder, as most likely they will require manual rewrites
    • this would be less of a problem over time
  3. the style sometimes it is not pretty to see, for example the famous "sad face":
    if (
       # long condition
    ):
       # some code

    while not nice to see (especially if you are not used to it), in the end it's one of the results of the consistent placement of opening & closing brackets that black manages; also, from personal experience, the style tends to be formatted "ugly" when it is already complex enough, for example lots of nested function calls, and usually becomes nicer when simplified/split a bit

Also because of the downside (2) above, my proposal is to reformat before the new release: this way, backports of changes after the release to the release would not get into code style issues.

In case this is wanted, I can help in all the steps needed:

@crobinso, @phrdina, opinions on this?

[1] https://peps.python.org/pep-0008/ [2] https://github.com/psf/black

crobinso commented 6 days ago

Thanks for the write up and the offer to help! I'm all on board. I'm sure aspects of the style will make my eyes bleed at first but I'll adjust, and I've been looking for an excuse to learn modern workflow tools.

I'm neutral on timing. I talked with @phrdina a bit and he had some opinions so I'll let him chime in. If we can speed up release cadence then maybe pre or post release timing doesn't matter much. If we release this month, then following release isn't going to be for a year+, we should definitely format before upcoming release :)

charlesa commented 6 days ago

I am inclined to think that making this type of change should come after the new release. I would suggest that the tree get branched right after the new release based on the original unformatted code. Then backporting a feature or fix to an older code base based on patches from that new branch (based on old formatting) will require less work. Those who carry distro specific downstream patches will have less work with an unformatted new release. Then after the release, those with downstream patches can take whatever time they need to refactor those downstream patches to apply to the reformatted code. One concern I have is what subtle formatting bugs will be introduced with these changes? How long will this reformatting take to the point where we have confidence in a stable release? It has already been two years since the last release.

ptoscano commented 6 days ago

I would suggest that the tree get branched right after the new release based on the original unformatted code. Then backporting a feature or fix to an older code base based on patches from that new branch (based on old formatting) will require less work.

virt-manager never had any stable branch: this is also why I proposed to reformat before the release.

One concern I have is what subtle formatting bugs will be introduced with these changes?

Good question, which gives me the opportunity to mention something more on black. black preserves the AST of the sources, and it will fail in case any of its changes would generate a different AST. It has AST preservation so much at its core, that you will see changes such as

 def update_changes(domain, devs, action, confirm):
     if action == "hotplug":
-        msg_confirm = _("%(xml)s\n\nHotplug this device to the guest "
-                        "'%(domain)s'?")
+        msg_confirm = _("%(xml)s\n\nHotplug this device to the guest " "'%(domain)s'?")
         msg_success = _("Device hotplug successful.")
         msg_fail = _("Error attempting device hotplug: %(error)s")

ie in the _() function call of msg_confirm there are two strings as parsed in the AST (they become one when executing the code), and black preserves them as they are.

So according to the experience that I have in blackening (reformatting with black), I can tell you that the answer is: no bugs are introduced.

How long will this reformatting take to the point where we have confidence in a stable release?

blackening is a very quick process: once the couple of details I mentioned above are settled (mostly the line length), then:

At that point, the whole process is done; an additional thing would be the CI job to enforce the style, which is easy to do. What might delay is that all the existing PRs (except the Weblate one) will require updates.

phrdina commented 6 days ago

Introducing bugs by doing the reformat is my main concern as well. Usually I would postpones such big change after making release, but on the other hand if you have a good experience with it we should be most likely fine doing it before release.

Now for the things we need to decide. The default 88 line length feels strange to me personally. I would go with 100 characters. And I would keep normalizing strings.