python-discord / bot

The community bot for the Python Discord community
https://pythondiscord.com
MIT License
1.37k stars 676 forks source link

Add mypy to the codebase #1773

Open ChrisLovering opened 3 years ago

ChrisLovering commented 3 years ago

I'd like to propose that we add mypy to the codebase and add it to our linting workflow.

The initial PR for this would be a large one, fixing the legacy issues, but from then on it'll be picked up before a PR can even be merged.

There are stubs that we can add as dev dependencies for discord.py and any other deps that we find lacking in the type hinting department.

This issue is mostly for planning for now, as I'd like to get more opinions on this change.

MarkKoz commented 3 years ago

I've never given mypy a proper shot before. The one time I tried it I got a bad impression because I was immediately trying to do something it cannot support (defining a JSON type, which is recursive). It's left me with the impression that it may be a burden trying to appease mypy. Anyway, I am open to trying it.

My concern is the type stubs. Do all packages we use have type stubs? Are they all well maintained and up to date? What recourse do we have if something is lacking type stubs or falls out of date?

TizzySaurus commented 3 years ago

My concern is the type stubs. Do all packages we use have type stubs? Are they all well maintained and up to date? What recourse do we have if something is lacking type stubs or falls out of date?

To install all the known stubs we can do mypy --install-types. I can't remember from my testing how many this installs but it does install a few.

As for libraries without stubs, we can just not check those libraries by passing --disable-error-code import. We probably also want to ignore attribute errors since we have custom attributes like Logger.trace, which can be done with --disable-error-code attr-defined.

We'd also need to use the --namespace-packages flag so that p.e. discord.ext and discord.ext.commands are supported under the discord stub.

TLDR; The command I use for mypy on the bot currently is mypy bot --disable-error-code attr-defined --disable-error-code import --namespace-packages. I uploaded the output of this in #dev-contrib. The command can of course be further adjusted to our needs.

decorator-factory commented 3 years ago

What are the benefits of adding mypy to the CI?

MarkKoz commented 3 years ago

What are the benefits of adding mypy to the CI?

To ensure the code conforms to mypy. We can encourage contributors all we want, but ultimately we have no way to enforce that they run mypy before making commits. However, we do have control over CI, so that's where we can enforce it.

decorator-factory commented 3 years ago

To ensure the code conforms to mypy

Of course, that's what adding mypy to the CI does. But why do we want the code to strictly conform to mypy? That was my question

MarkKoz commented 3 years ago

To ensure the code conforms to mypy

Of course, that's what adding mypy to the CI does. But why do we want the code to strictly conform to mypy? That was my question

How do you use mypy? Is there some middle ground between not using it at all and enforcing it in CI? Or are you instead asking why we should be using mypy in any capacity?

decorator-factory commented 3 years ago

Or are you instead asking why we should be using mypy in any capacity?

Yes, that's what I'm asking. As you mentioned, mypy isn't perfect, and there are issues like missing/outdated/poor type stubs, which could be an obstacle. Therefore I'm asking what benefits it would bring so we can evaluate the pros vs cons.

decorator-factory commented 3 years ago

Perhaps we should also evaluate alternative type checkers, like pyright or pyre? Their main disadvantage is that they don't support plugins, but they have some features that mypy doesn't (for example, pyright supports recursive type aliases).

MarkKoz commented 3 years ago

An advantage is that it provides type safety, which can catch more errors. A disadvantage is that we cannot be lazy with the type annotations (e.g. the callables for decorators).

TizzySaurus commented 3 years ago

I've actually been briefly looking over some mypy and there's a lot of false positives. Perhaps some of these could be brought down further by using the correct command line arguments but it's certainly annoying as-is.

As an example, we get error: Item "None" of "Optional[Role]" has no attribute "name" for the code

        for role_id in role_ids:
            if (role := guild.get_role(role_id)) is not None:
                members += len(role.members)
            else:
                raise NonExistentRoleError(role_id)
        return {name or role.name.title(): members}

The error comes from the role.name.title() on the last line but that will never get executed unless the role isn't None, and so has the attribute (the if/else controls this)

The only way I can find to "fix" this is to just ignore all union-attr errors, which then means potentially missing actual issues and sorta defeats the entire purpose of adding mypy.

EDIT: I've just been informed that typing.cast exists, so in this instance we can do

return {name or cast(Role, role).name.title(): members}
wookie184 commented 2 years ago

I've started having a look into whether this would be an improvement. So far it's seeming pretty good IMO. Some things I've looked into:

Custom converters

Custom converters don't represent the actual type passed to the function

async def my_command(self, ctx: Context, argument: URLConverter) -> None:
    reveal_type(argument)  # URLConverter. uh oh

We can fix these using typing.Annotated

async def my_command(self, ctx: Context, argument: Annotated[str, URLConverter]) -> None:
    reveal_type(argument)  # str

From a quick look it seems discord.py should support this. Defining the annotated versions in converters.py makes this is pretty neat.

Getting channels/roles etc from constants

channel = self.bot.get_channel(Channels.off_topic_0)
channel.send("HI")

get_xxx methods can return None. Generally we just ignore these when working with constants because we expect they'll be there, but a type checker wouldn't like that.

I think this is a good thing! Especially in dev it's quite possible that these things are undefined, and the error from that can be quite unclear, so I think it would make sense to change this even if we don't decide to use a type checker.

One solution would be to create our own functions that instead error if the channel is None. This would be pretty simple to do. It may be worth doing this in the cog_load, which would allow us to ensure a cog isn't loaded if the necessary constants aren't there.

Different types of Messageable/Channel

This is one thing I noticed where I think a type checker would be a big improvement. Discord has added a lot of new channel types recently (threads, forum channels, voice chat), and currently it's really difficult to work out what code needs to be updated to handle those (e.g. the silence command). A type checker would help us identify ensure we are covering all cases properly.

Site API and other external APIs

Currently we just assume responses fit the schema we're expecting to recieve. If we're happy with this assumption we can just type those responses as Any and we're no worse off than where we started. If we're willing to add in a bit more code to get proper validation and type hinting, pydantic would be a nice addition.

Other libraries

The main library we rely on is discord.py, which seems to be type hinted pretty well.

https://github.com/python-discord/bot-core/issues/134 would be helpful, as currently bot.site_api is something that can be None that we ignore.


Should we try and add mypy support?

Personally I think having mypy support would be pretty helpful. There are quite a few cases where bugs that have happened would have been caught by a type checker (https://github.com/python-discord/bot/pull/2274, https://github.com/python-discord/bot/pull/2273, https://github.com/python-discord/bot/pull/2260 are recent examples).

I wouldn't suggest trying to make a massive PR adding mypy support everywhere at once, that would be a massive task, there's no issue with starting by just enabling it file by file.

When I get a chance I'll try modifying a cog or two to pass mypy's checks. That should help us get a proper idea of whether this will be more invasive or helpful.

decorator-factory commented 2 years ago

Am I correct that Annotated in this case isn't actually "type safe"? As in, I can just do Annotated[int, URLConverter] Or is that going to be mostly solved with defining type aliases for Annotated in one place?

wookie184 commented 2 years ago

Am I correct that Annotated in this case isn't actually "type safe"? As in, I can just do Annotated[int, URLConverter] Or is that going to be mostly solved with defining type aliases for Annotated in one place?

Yeah (to both things)

I just realised we currently use the hacky solution of redefining the converters as the types they convert to under a TYPE_CHECKING condition. Pyright doesn't seem to like that (keeps using the original class type), not sure about mypy.

MarkKoz commented 2 years ago

Is there any accepted workaround for when annotations are used for something other than annotating types (such as for us with converters)? Or do type checkers work on the assumption that annotations always correspond to types?

wookie184 commented 2 years ago

Is there any accepted workaround for when annotations are used for something other than annotating types (such as for us with converters)? Or do type checkers work on the assumption that annotations always correspond to types?

That's what typing.Annotated is for.

There is also the typing.no_type_check decorator, but that would ignore all annotations on the function. (in practice, it seems mypy ignores the whole function, contrary to the typing docs? Pyright also doesn't seem to support it at all.)