Attempt to use Ruff formatter on this project

benhoyt commented 8 months ago

I think it's time to use automatic code formatting. We could use Black, but because we want to use Ruff for linting, we should also probably use Ruff's code formatter. It fixes several minor consistency issues with Black, and also adds a single-quote knob (yay! :-).

We should keep in mind that this project has tried to switch to Black twice in the past:

Nov 2019 in PR #49. Comment against: https://github.com/canonical/operator/pull/49#pullrequestreview-320942544
Again in Sep 2021 in PR #621. Comment against: https://github.com/canonical/operator/pull/621#issuecomment-925578397

We should consider those comments seriously, and see if there are fixes / workarounds for those issues, either by increasing the line length, adding/removing trailing commas to guide the tool, or manually reformatting the few places where the tool makes things significantly worse.
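
As a rough sketch of the trailing-comma workaround: both Black and Ruff document a "magic trailing comma", where a trailing comma after the last element asks the formatter to keep one element per line, and removing it lets a short collection collapse onto a single line. This assumes the formatter's default settings, and the list contents below are made up for illustration.

```python
# Hypothetical data, only to illustrate the two layouts.

# With a trailing comma after the last item, the formatter is expected to
# keep one item per line, even though everything would fit on one line.
expanded = [
    "install",
    "start",
    "config-changed",
]

# Without the trailing comma, the formatter is free to collapse the list
# onto a single line when it fits within the configured line length.
collapsed = ["install", "start", "config-changed"]

# The layout is purely cosmetic: both spellings build the same list.
assert expanded == collapsed
```

So in places where the tool's default wrapping looks worse, adding or removing that one comma is probably the cheapest thing to try first.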

That said, the benefits of using an automatic code formatter are significant.

benhoyt commented 6 months ago

We'd like to pair on this in person in Madrid: one of us can do a quick first pass, then we can go over style concerns together and try to nut it out in a morning.

benhoyt commented 4 months ago

A couple of thoughts after looking at @IronCore864's preview of model.py:

1) I quite like the more consistent function parameter style, so I'm fine with that one.
2) I think we should consider increasing the max line length from 99 to, say, 109 or 119 columns. I think this would help avoid wrapping function calls and error messages too much on lines that are already somewhat indented. For reference, on my big screen I fit two panes of code side by side, each with 110 columns.
3) The one thing that stood out as annoying was that it doesn't seem to use "cuddled braces". For example:

# Old: 3 lines, easy to read
self._data.update({
    self.relation.app: RelationDataContent(self.relation, self.relation.app, backend),
})

# New: 7 lines! harder to read
self._data.update(
    {
        self.relation.app: RelationDataContent(
            self.relation, self.relation.app, backend
        ),
    }
)

Maybe we'll just have to get over that. Or maybe we can rewrite the ones that expand crazily to avoid the crazy 7-line wrapping:

data = {self.relation.app: RelationDataContent(self.relation, self.relation.app, backend)}
self._data.update(data)

tonyandrewmeyer commented 4 months ago

My NZ$0.02:

Firstly some disclaimers: before Black existed, the style guide I used (and had my teams use) was very similar, I adopted Black quite early (partly because there were so few changes), and I've been using it for most of my code ever since, so I'm very accustomed to it (and therefore Ruff's black-equivalent style). So even though Stockholm Syndrome may not be a real thing, I might have it in this case :laughing:.

I don't really like the Name = TypedDict("Name", {...}) style of TypedDicts anyway, but I think I slightly prefer the way we have them at the moment, with the name on the same line twice. If these are going to change anyway, what about using class Name(TypedDict): except for the few cases where that won't work with the names?
When an argument list is too long for one line, I do prefer ruff's approach of one-per-line rather than keeping the number of lines minimal like we do now.
It took me a while to get used to having the closing parenthesis/bracket/brace on a separate line (or separate with a return type), but I do like it now, and I find that it avoids some ugly cases where you have to do extra indenting to make things clear.
The examples in model.py where ruff reduces the number of lines all seem ok to me.
We've talked about this before, but I like being consistent with regards to ' and ". I value the consistency here more than the actual choice.
I agree with Ben about the cuddled braces. Does ruff force this, or will it leave them alone if manually cuddled?
I think this is a nice example of where the change improves readability:

# Old:
            stop: Tuple[str, ...] = tuple(s.name for s in self.get_services(
                *service_names).values() if s.is_running())
# New:
            stop: Tuple[str, ...] = tuple(
                s.name for s in self.get_services(*service_names).values() if s.is_running()
            )

Ellipsis on the same line is new to me. I'm unsure about this, but I think I slightly prefer the old way.
I like forcing trailing commas.
I've seen Black make this blunder too - I expect we'll need to carefully look for them.

-                f"key {key!r} is invalid: must be similar to 'key', 'some-key2', "
-                f"or 'some.key'")
+                f"key {key!r} is invalid: must be similar to 'key', 'some-key2', " f"or 'some.key'"
+            )
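
To spell out why this is only a cosmetic blunder: Python joins adjacent string literals at compile time, so the two f-strings left on one line still produce the same message. A minimal sketch, with a made-up key value; the merged form at the end is the kind of manual cleanup we'd probably want after formatting.

```python
key = "bad key"  # hypothetical invalid key, for illustration only

# Original layout: two literals split across lines.
original = (
    f"key {key!r} is invalid: must be similar to 'key', 'some-key2', "
    f"or 'some.key'"
)

# Layout after formatting: both literals end up on one line, which still works
# because adjacent string literals are concatenated implicitly.
formatted = f"key {key!r} is invalid: must be similar to 'key', 'some-key2', " f"or 'some.key'"

# Manual cleanup: merge the two literals into one.
merged = f"key {key!r} is invalid: must be similar to 'key', 'some-key2', or 'some.key'"

assert original == formatted == merged
```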

With regards to the comment about commas in one of the earlier attempts, I think that was either a bug or something Black changed in the style - I don't remember seeing it, and it doesn't happen now.
With regards to chunks of hand-crafted formatting, which I think is most common in tests, I'm not a huge fan of these in general, but I agree the auto-formatted version looks worse. However, I think this is rare enough that a few off/on pragma statements would be ok so that they can be kept.
Similarly, I don't really like aligned columns of inline comments, but if they really are needed in exceptional cases, I think a few off/on pragma is reasonable.
In terms of line length, I think we should be guided by research on readability - there have been decades of work on this. Code is admittedly a bit different from general text - monospaced, more whitespace - but there's research for code width too.
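
For reference, a minimal sketch of what those off/on pragmas could look like, assuming the `# fmt: off` / `# fmt: on` comments that Black and Ruff's formatter both recognise. The table and the helper function are made up for illustration.

```python
# Hypothetical test data with hand-aligned columns.

# fmt: off
CASES = [
    # name,        input,   expected
    ("empty",      "",      0),
    ("one-word",   "juju",  1),
    ("two-words",  "a b",   2),
]
# fmt: on


def count_words(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


# Everything outside the off/on pair is formatted as usual.
for name, text, expected in CASES:
    assert count_words(text) == expected, name
```

Only the region between the two comments should be left untouched, so the cost of keeping a hand-aligned table is two extra comment lines.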

IronCore864 commented 4 months ago

iPhone vs. Android: which one do you like? Many people have a preference, but to me (and maybe to others), they are the same: they do exactly the same thing, and they even look more and more alike nowadays. I only chose iPhone because I couldn't be bothered to spend hours deciding which Android phone to buy. That doesn't mean iPhone is better than Android. Many choose Android for their own reasons, but that doesn't mean Android is better than iPhone either. There is no "best"; if there were, everybody would go for the best choice, and other choices wouldn't exist any more.

Where on Earth would you most like to live? Tokyo? Shanghai? New York? The list goes on. Everybody has a preference, but there is no "best city" to live in. Same logic: if there were, everybody would be moving there, leaving all the other cities empty. It's all personal preferences and priorities.

This brings me to the discussion on Black vs. Ruff (might as well throw in autopep8). None is perfect; there is no "best" option. If there were, everybody would switch to it, and the other options wouldn't exist. I do not have a strong preference regarding Ruff vs. Black. They are both fine. autopep8 is okay, too. No matter which you choose, there will be corner cases that make you doubt your choice.

That said, I still did a comparison between black and ruff and here are some examples where they differ:

Sample 1:

_AddressDict = TypedDict(
    "_AddressDict",
<<<<<<< ruff
    {
        "address": str,  # Juju < 2.9
        "value": str,  # Juju >= 2.9
        "cidr": str,
    },
=======
    {"address": str, "value": str, "cidr": str},  # Juju < 2.9  # Juju >= 2.9
>>>>>>> black
)

Here I prefer ruff.

Sample 2:

        self._relations = RelationMapping(
<<<<<<< ruff
            relations, self.unit, self._backend, self._cache, broken_relation_id=broken_relation_id
=======
            relations,
            self.unit,
            self._backend,
            self._cache,
            broken_relation_id=broken_relation_id,
>>>>>>> black
        )

Still ruff.

Sample 3:

    def __init__(
<<<<<<< ruff
        self, name: str, meta: "ops.charm.CharmMeta", backend: "_ModelBackend", cache: _ModelCache
=======
        self,
        name: str,
        meta: "ops.charm.CharmMeta",
        backend: "_ModelBackend",
        cache: _ModelCache,
>>>>>>> black
    ):

Black here since the line starts to become too long to be read efficiently.

As you can see, even for the same person, it's not easy to decide which is best. If this was Sophie's choice, that movie would be 5 hours long instead of just 2h30m.

If I have to make a choice here, I choose ruff, not because of the style differences, but because ruff is written in Rust and that makes me think it's probably faster than black (which might not always hold true in the real world).

I don't think we should make a decision based on personal preferences because by definition, personal preferences differ. How about a vote?

tonyandrewmeyer commented 4 months ago

Ah, sorry, I didn't mean to imply that we should choose between black and ruff. Ruff's formatter is more-or-less Black, and we should definitely use Ruff if we change, not consider using Black. I was just meaning to provide context in that I have been using the "Black style" for a long time, so am probably biased because of that.

The choice here is really between autopep8 and isort (the status quo) and ruff.

> I don't think we should make a decision based on personal preferences because by definition, personal preferences differ. How about a vote?

I don't think we need to vote, we can just talk it over in person and come to a consensus.

tonyandrewmeyer commented 1 month ago

I believe this is complete and we just missed closing the issue, likely because there were multiple PRs.

Attempt to use Ruff formatter on this project #1103