Open jtran opened 1 year ago
It's important to note that currently, all the attributes on the objects we generate from API responses are dynamically generated. For example, a Repository
object will have attributes like name
generated from the response. I assume this was done initially to make py-gitea more flexible, so that there could be changes to the fields in gitea without breaking py-gitea. However, this both makes it hard to type and hard for us to version, so I think we should consider adding those attributes to the classes with types.
I spent some time evaluating how to do this, and I thought I'd checkpoint with what I've done so far.
To recap, the main reason we can't type check the codebase right now is that all of the classes in apiobject.py
have attributes dynamically injected at runtime from the JSON response returned by AllSpice Hub. To pass mypy, we need to add these attributes.
The ideal solution here would be something that doesn't require too much manual effort to create and maintain and can be automatically checked with new versions of AllSpice Hub to catch regressions.
What I've evaluated:
Doing it manually: Obviously, we can go class by class and add the attributes to each. The main problem with this is that it requires a lot of effort. We have a lot of classes, and they have a lot of attributes. Some of those attributes are of one-off types, like Branch.commit
being a PayloadCommit
instead of a Commit
. This gets worse because a PayloadCommit.user
is a PayloadUser
😰. The workaround here would be to use dict
s for these until they become important enough to deserve their own classes, or try to convert them to other types via _fields_to_parsers
.
The other problem with this approach is that it isn't simple to test against the current version of the API. We can typecheck it, but there isn't a clean way for us to check whether the types we're getting from the API are the same as the types we've written. This means that if there are regressions we might not know until they're spotted in the wild.
Generating stubs with the OpenAPI spec: This was my first attempt at doing this automatically. This would unlock automated testing, as you can compare the types you have checked in vs the types you are getting from the API. The downsides are that this is still susceptible to the one-off types issue (and can be worked around the same way) AND doesn't know how to deal with the _fields_to_parsers
. So I had to manually change the types it came up with for both of these cases, which again makes it difficult to automatically test. Ultimately I don't think this is worth the effort.
Using a runtime stub generator like Monkeytype: MonkeyType runs your test suit and uses it to generate stubs based on the real types of your objects. This is wonderful! Except it can't generate stubs for attributes. Their reasoning for this is that attributes can change any time, and python doesn't give you hooks to watch these changes. I think that's true in general, but in py-allspice we can be reasonably sure that attributes change only once, when the object is parsed from the response. That is why I think patching MonkeyType to stub attributes would be the best solution here:
But this approach still isn't without its problems. For one, we have to patch MonkeyType to do this, which is bound to be at least a little complicated. Then, we have to figure out a way to only use the attribute types from MonkeyType, since in my testing the generated types for functions weren't as good as our handwritten types.
Right now, I think the choice is between investing in Option 3 or going class by class on Option 1. MonkeType is a good choice if we want to automate testing the types against the live API, and hand-writing types is fine if we are okay with the risks of not testing them automatically.
I like the idea of using MonkeyType; I used this to generate types in some of the more gnarly modules in Tools. I was actually thinking about another approach that could augment number 3. I.e. maybe we can do some kind of hybrid where we use MonkeyType for non-attributes, but we additionally use:
null
, that won't tell us what type the fields are -- but hopefully we have this already. And if we don't, it would be worthwhile to invest in tests that do that.That's a really interesting idea. I thought about it and I think it might be the simplest automated solution: we hook into parse_response
using sys.setprofile
like monkeytype does, and since the returned object is the API object, we can look at its attributes and their types. We can then use libcst (again, like monkeytype) to apply these changes. I'll attempt the first half of this first to see if the results are useful.
Most of the remaining typecheck issues fall under three broad categories:
E.g:
ReleaseAsset
: https://github.com/AllSpiceIO/py-allspice/blob/322b2db3bfaa9a02d839aa176350af0975f60c19/allspice/apiobject.py#L2229This one is simple enough. The root cause here is probably that in some test somewhere the field is not set or set to a bad type. To fix this, we'll have to find which codepath is responsible for that type and find out why it is doing so. If it turns out that that field is legitimately optional or has that alternate type, we have to add guards throughout the rest of the code.
This is pervasive throughout the code. For example, BaseApiObject
defines an implementation of request
that doesn't even work:
Which is then overridden in incompatible ways by subclasses:
(The name
argument here is the source of the incompatibility.)
There's two ways to fix this, and unfortunately they're both breaking changes:
*args, **kwargs
) to the base implementation and make every inheriting object also take catch all arguments. IMO this is a bad solution!E.g.:
Release
: https://github.com/AllSpiceIO/py-allspice/blob/322b2db3bfaa9a02d839aa176350af0975f60c19/allspice/apiobject.py#L1828This is more complicated because technically this isn't a problem with the typegen script. The root cause here is that for classes like Issue
, we want to have a pointer to a Repository
within them, but the response doesn't have the Repository
. So we need to inject a repository from somewhere into the issue, which we usually do after parse_response
. Now the typegen script doesn't know that the type is changing after parse_response
, which leads to this error.
Now, the problem here is that this also ties into the first and second class of problems because of how we have to solve it. The simplest solution is to add a request
call in the parse_response
method of the class. For example, Issue.parse_response
would then become:
@classmethod
def parse_response(cls, allspice_client: AllSpice, result: Mapping) -> Issue:
api_object = cls(allspice_client)
cls._initialize(allspice_client, api_object, result)
repository = Repository.request(...)
return api_object
This is bad, because this means that every issue will make a network issue to fetch its associated request. So if you run repo.get_issues
, you will make an N+1 call!
Two immediate solutions come to mind for this:
parse_response
to take the repository as an argument: Unfortunately, that makes this a class 2 problem, i.e. an incompatible override. I think the best solution here is to add a **kwargs
final argument to the base class parse_response
method, and when an API Object needs more context in parse_response
, it checks the kwargs for them, failing if it doesn't find it. To illustrate:
@classmethod
def parse_response(cls, allspice_client: AllSpice, result: Mapping, **kwargs) -> Issue:
api_object = cls(allspice_client)
cls._initialize(allspice_client, api_object, result)
if "repository" not in kwargs:
raise ValueError("Issue.parse_response needs a repository kwarg.")
repository = kwargs["repository"]
return api_object
This isn't the best, but I can't think of a better solution that also doesn't have its own downsides.
Right now, there are many type annotations. But as far as I can tell, none of them are used. Running
mypy
yields hundreds of errors.I'm sure many errors will go away from tiny fixes that are used a bunch of places, like this:
https://github.com/AllSpiceIO/py-allspice/blob/aa26235f262fd014d5df6c18f614d864d726bd66/allspice/apiobject.py#L833
This issue is to get it passing. Then we'll make a separate issue to add it to CI so that we stay passing.