typeddjango / django-stubs

PEP-484 stubs for Django
MIT License

Hygienic mypy plugin: don't import Django into the mypy process #1384

Open intgr opened 1 year ago

intgr commented 1 year ago

We have no immediate plans to implement this, but I decided to open this issue for discussions, to split it out from #1337.

Original post by @intgr https://github.com/typeddjango/django-stubs/issues/1337#issuecomment-1401668445

The mypy plugin should be "hygienic" with regard to the Django code being checked. It should not directly import project code or Django into the mypy runtime, so that it doesn't get entangled with the code it's checking. Instead it could, for example, invoke subprocesses to discover Django internals.

The current implementation is problematic for many reasons. For instance, mypy shouldn't crash with a traceback when there's a syntax error in the Django settings. And when using the mypy daemon dmypy, it should be capable of re-discovering changed models files, instead of using obsolete state or randomly crashing, as it does now.

intgr commented 1 year ago

Comment by @adamchainz https://github.com/typeddjango/django-stubs/issues/1337#issuecomment-1414395472

Probably a bit of a brain twister but would be nice to see. I think it would also make it a bit easier to separate out what needs discovering and open the door for other type checkers.

intgr commented 1 year ago

Original comment by @sobolevn https://github.com/typeddjango/django-stubs/issues/1337#issuecomment-1449789551

Since we are discussing a lot of things in this issue, I will add my 2c about "Same with the hygienic plugin redesign".

I think it is a really bad idea. When we were originally designing the current state of the plugin, we already had a previous, static-analysis-only version of the plugin. It was a nightmare.

Since Django has so many magic things happening in the model definition, it is almost impossible to get right.

And running Django during type checking is not a bad thing. For example, django-check does this as well. So, this is rather standard.

So, I highly recommend not changing this.

intgr commented 1 year ago

Original comment by @flaeppe https://github.com/typeddjango/django-stubs/issues/1337#issuecomment-1449863734

I've touched the plugin code a bit and I'd like to agree with both of your opinions regarding the hygienic plugin redesign. I've seen a bit of "shortcutting", in regards to choosing to use Django (runtime) internals to resolve types in some places. What I mean is that it would/could be possible to use mypy's internals to figure out the correct types, but Django's have been used instead, probably because it's more straightforward and simple. It could very well be possible that what mypy offers is/was (currently) insufficient too.

With that said, I have a very hard time seeing that it will be possible to get rid of the dependency on Django's runtime completely, but I'm quite certain that some parts could (and perhaps should) stop using Django to compute types and start using some "mypy approach" instead (or a custom one, for that matter).

Two other "hygienic" parts off the top of my head are:

  • Report coverage for the plugin's Python code (I'm suspecting quite a bit of dead code)
  • Have test case support for cached mypy runs (there are a lot of bugs that have been resolved but that we have no regression tests for, due to this limitation)

I think both of the above might be fixes for/in https://github.com/TypedDjango/pytest-mypy-plugins

intgr commented 1 year ago

I think it is a really bad idea. When we were originally designing the current state of the plugin, we already had a previous, static-analysis-only version of the plugin. It was a nightmare.

Since Django has so many magic things happening in the model definition, it is almost impossible to get right.

I suspected this as well; reimplementing all of Django's logic in our plugin may be too big and messy to maintain. But the current situation is messy as well; does anyone really disagree?

My thinking was to implement something in between the two approaches: the plugin would work off a cache of model metadata. When our mypy plugin needs to update the cache of Django model metadata, it would invoke a subprocess that may import Django, discover all the internals it needs, and return them in some format. If this subprocess crashes, we can emit some warnings and continue with the old cache, instead of failing entirely.

This would solve the main issues I see with the current design:

  1. Slowness -- Django is a behemoth to boot up (e.g. #1341). I'm assuming we could instead use cached Django metadata during most invocations.
  2. We don't support mypy daemon properly -- when models change, there's no way to discover new changes when the files were already imported.
  3. Crashes -- it's really ugly that mypy crashes with a traceback when there's an error in project code that Django imports (e.g. models).

But I might just be underestimating how difficult this cache discovery or management would be.
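To make the subprocess idea concrete, here is a minimal runnable sketch. All names are hypothetical, and the discovery script just emits a hard-coded payload so it runs without Django installed; a real one would call django.setup() and walk the app registry instead:

```python
import json
import subprocess
import sys
from typing import Optional

# Hypothetical discovery script. A real one would call django.setup()
# and serialize apps.get_models(); this placeholder emits a fixed
# payload so the sketch runs without Django installed.
DISCOVERY_SCRIPT = """
import json
metadata = {"customers.Customer": {"fields": {"id": "AutoField"}}}
print(json.dumps(metadata))
"""

def discover_model_metadata(cached: Optional[dict] = None) -> dict:
    # Run discovery in a subprocess so a crash there (e.g. a syntax
    # error in settings) never takes down the mypy process itself.
    try:
        result = subprocess.run(
            [sys.executable, "-c", DISCOVERY_SCRIPT],
            capture_output=True, text=True, check=True, timeout=60,
        )
        return json.loads(result.stdout)
    except (subprocess.SubprocessError, ValueError) as exc:
        # Warn and fall back to the stale cache instead of crashing.
        print(f"model discovery failed ({exc}); using cached metadata",
              file=sys.stderr)
        return cached or {}
```

The key property is the except branch: a broken settings module degrades to a warning plus stale results, rather than a mypy traceback.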

federicobond commented 1 year ago

One approach that can be helpful here is to begin by restricting direct manipulation of Django objects to the DjangoContext class. All other users inside the plugin must deal with lightweight objects returned by DjangoContext methods that serve as representations of the underlying Django model metadata.

This way it's possible to map the extent of the information that we would have to obtain and cache from Django without committing to a particular implementation. Once the rest of the plugin is using these lightweight metadata objects it's then possible to construct them inside DjangoContext from information gathered by a separate isolated process, for example.
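For illustration, that boundary could look roughly like this; all names here are hypothetical sketches, not the plugin's current API:

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical lightweight metadata objects. Everything outside
# DjangoContext would only ever see these, never live Django objects.
@dataclass(frozen=True)
class FieldInfo:
    name: str
    field_class: str                     # e.g. "django.db.models.ForeignKey"
    related_model: Optional[str] = None  # fullname of the target model, if any

@dataclass(frozen=True)
class ModelInfo:
    fullname: str                        # e.g. "customers.models.Customer"
    app_label: str
    fields: Tuple[FieldInfo, ...]

class DjangoContext:
    """The single place allowed to touch Django directly."""

    def get_model_info(self, fullname: str) -> ModelInfo:
        # Today this would be built from the live app registry; later
        # the same ModelInfo could come from a cache or a subprocess.
        raise NotImplementedError
```

Because the rest of the plugin only consumes frozen dataclasses, swapping how DjangoContext fills them in becomes an isolated change.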

ljodal commented 10 months ago

I'd be very interested to work on this. We have multiple large Django projects at work and their startup times make it quite painful to work with mypy, especially when you want fast feedback, like through an editor integration. It's also quite annoying that you lose all feedback while editing if there's an error anywhere in the codebase; normally mypy would be happy processing the rest of the files.

Does anyone have an overview of what features currently rely on the runtime? I'm most familiar with the ORM parts of the plugin and know that it's used extensively there, but I also feel like it should be possible to extract the required information statically. Yes, we'd end up implementing some Django logic, but I don't think it'd be that bad. The one thing I'm unsure about is whether it'd be possible to "backfill" previous models with related managers as we discover them, or if we need some kind of global model state available before we can fully count the models as processed.

I like the idea of running a separate process to load the Django state, however that'd still have all the downsides of loading up a large and slow codebase. And if we want it to be performant it has to be cached, which means dealing with cache invalidation. I suspect the effort required to get a good caching strategy to work here would be quite high.

flaeppe commented 10 months ago

[...] Does anyone have an overview of what features currently rely on the runtime? [...]

A lot of the model-related reliance on the runtime can be found here (triggered early on by mypy's get_base_class_hook):

https://github.com/typeddjango/django-stubs/blob/0117348c3c7713f25f96b46e53ebdeed7bdba544/mypy_django_plugin/transformers/models.py#L590-L601

Although I think the general case is that usage of the runtime is quite scattered over the whole plugin. I agree with others that there's probably a better abstraction layer to be had, to sort things out.

I like the idea of running a separate process to load the Django state, however that'd still have all the downsides of loading up a large and slow codebase. And if we want it to be performant it has to be cached, which means dealing with cache invalidation. I suspect the effort required to get a good caching strategy to work here would be quite high.

What about starting to get better compatibility going with the Mypy daemon? That should probably speed things up, if possible.

ljodal commented 10 months ago

Hmm, okay, so looking around the code it seems like the Django runtime is used for basically two things: 1) models and related metadata and 2) settings. I believe it should be possible to solve both of those cases without the runtime, but it would probably have some limitations related to things that are very dynamic (i.e. managers defined with a dynamic super class and other things that are not easily resolved using static analysis). For those cases it should be possible to add some manual type hints to help the plugin understand that a manager exists, similar to how you'd add hints if you're trying to type a Django project without the plugin.

I like the idea of running a separate process to load the Django state, however that'd still have all the downsides of loading up a large and slow codebase. And if we want it to be performant it has to be cached, which means dealing with cache invalidation. I suspect the effort required to get a good caching strategy to work here would be quite high.

What about starting to get better compatibility going with the Mypy daemon? That should probably speed things up, if possible.

Yeah, that would definitely be a nice improvement, though I'm completely unfamiliar with the daemon, so I wouldn't know where to start. I tried setting up dmypy through pylsp just the other day and couldn't get it to work. Not entirely sure what the problem was, but I suspect it was somehow related to the Django plugin.

intgr commented 10 months ago

if we want it to be performant it has to be cached, which means dealing with cache invalidation. I suspect the effort required to get a good caching strategy to work here would be quite high.

Mypy already caches types in parsed files and handles cache invalidation (.mypy_cache). It also provides some cache hooks for plugins. It might be possible to piggyback on that mechanism to know when we need to re-analyze models and/or settings.

What about starting to get better compatibility going with the Mypy daemon?

FWIW I've tried dmypy on multiple occasions to automatically typecheck files after saving, and found it to be quite unreliable. When invoking dmypy run in sequence to check various different files, it has caused crashes, hangs, false positive errors and false negative errors (compared to mypy without daemon). I haven't reported them as they are difficult to isolate, and they may be caused by mypy_django_plugin. But I get the impression they are upstream dmypy issues.

Proper invalidation for discovered Django models would be helpful for dmypy usage anyway. With our current architecture, the user would have to manually restart dmypy every time they change models, to get accurate results.

ljodal commented 10 months ago

I was curious about what it would take to set up the models without the runtime information, so I've spent a few hours today playing around with setting up a plugin from scratch. My progress so far can be found here: https://github.com/ljodal/django-stubs-poc (at the time of writing on this commit)

The readme describes some of the complexity I've had to work around to get the dependencies between models set up. It's not ideal, since there's no way of telling what related managers exist just by looking at a single model definition. It is however possible to make this work well enough, I think.

My idea is basically to build up some state with the information needed to add all the dynamic runtime attributes. Another approach to getting this is to manually resolve all the model files and parse them using the ast library. I expect that to be significantly faster than loading the full runtime, especially for larger projects. If we can define this state entirely without relying on Django's classes, we could even have swappable "backends" here, so you'd be able to choose which approach you want based on the project.
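As a toy illustration of the ast approach (the source string and helper are made up for the example), a ForeignKey's target and its related_name, which is what creates the reverse manager on the target model, can be extracted without importing anything:

```python
import ast

# Made-up example source; in practice this would be read from a
# project's orders/models.py on disk.
SOURCE = '''
class Order(models.Model):
    customer = models.ForeignKey("customers.Customer", related_name="orders")
'''

def find_foreign_keys(source: str) -> list:
    """Collect (target model, related_name) pairs from ForeignKey calls."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr == "ForeignKey"):
            target = node.args[0].value if node.args else None
            related_name = next(
                (kw.value.value for kw in node.keywords
                 if kw.arg == "related_name"),
                None,
            )
            found.append((target, related_name))
    return found
```

This obviously only covers the literal-string form of the argument; relations expressed through variables or helper functions would need the richer analysis discussed above.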

flaeppe commented 10 months ago

Nice experimentation!

[...] since there's no way of telling what related managers exist just by looking at a single model definition. [...]

I'm not sure I agree with this statement. I mean, by statically analysing a model and its MRO we should be able to find all relations the model declares, I hope? I'm thinking that as long as we can always figure out what other model the current model relates to, it should be possible to establish additional attributes (read: managers) on the related model.

In addition to that, Django comes with a couple of tricks and interfaces to figure out where and which is the related model e.g. ForeignKey("<app_label>.<ModelName>", ...). But I don't see it as impossible to process all */models.py. Then check the names of those modules for TypeInfo that is a subclass of BaseModel etc. etc. Now I'm rambling, but what I mean is that we know Django expects "<app_label>.<ModelName>" to be globally unique, as such I'm thinking we should, in one way or another, be able to resolve it to a fullname.

What I'm thinking is that the plugin should assume all models are registered. It feels out of scope for it to attempt to figure out if the Django runtime considers them registered or not? As such, all statically resolvable "<app_label>.<ModelName>" references would be considered OK. I suppose what becomes a new problem then is ambiguity: if there are seemingly two modules that resolve to the same app label and contain identical model names...

So yeah, it isn't trivial..
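A minimal sketch of that resolution step, with a hand-built registry (in the plugin it would be populated while scanning all */models.py, and the names here are invented for the example):

```python
# Maps (app_label, ModelName) to candidate fullnames. Hand-built here;
# the plugin would fill this in while scanning all */models.py files.
REGISTRY = {
    ("customers", "Customer"): ["customers.models.Customer"],
}

def resolve(ref: str) -> str:
    """Resolve a Django "<app_label>.<ModelName>" string to a fullname."""
    app_label, model_name = ref.split(".", 1)
    candidates = REGISTRY.get((app_label, model_name), [])
    if len(candidates) != 1:
        # Zero candidates: unknown reference. More than one: exactly
        # the ambiguity problem described above.
        raise LookupError(f"cannot resolve {ref!r}: {candidates}")
    return candidates[0]
```

The uniqueness Django guarantees at runtime becomes an assumption here, so the error branch is where the ambiguity problem would surface as a diagnostic.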

From your README:

[...] Another issue we need to resolve is app labels. [...]

We could definitely also process all apps.py to figure out unique app labels, perhaps even translate Django's implementation into "mypy logic". I don't think we need to care about duplicate app labels within the same module. But it would/could be troublesome if two different modules conflict in app label name: we don't know which is installed, and thus not which models.py is referenced...

[...] Another approach to getting this is also to manually resolve all the model files and parse the files using the ast library. [...]

Mypy has already parsed the AST and massaged it into its internal data structures, found in nodes.py, where a Python file's (module's) root object is called MypyFile. Couldn't we utilise those structures better?

In the end those are what we need to find to report back to mypy. So we might as well use them to begin with?

ljodal commented 10 months ago

Nice experimentation!

[...] since there's no way of telling what related managers exist just by looking at a single model definition. [...]

I'm not sure I agree with this statement. I mean, by statically analysing a model and its MRO we should be able to find all relations the model declares, I hope? I'm thinking that as long as we can always figure out what other model the current model relates to, it should be possible to establish additional attributes (read: managers) on the related model.

Let me clarify with an example:

# customers/models.py

class Customer(models.Model):
    ...

# orders/models.py

class Order(models.Model):
    customer = models.ForeignKey("customers.Customer", related_name="orders")

Here there's no way of telling that Customer.orders exists without looking at the Order model.

In addition to that, Django comes with a couple of tricks and interfaces to figure out where and which is the related model e.g. ForeignKey("<app_label>.<ModelName>", ...). But I don't see it as impossible to process all */models.py. Then check the names of those modules for TypeInfo that is a subclass of BaseModel etc. etc. Now I'm rambling, but what I mean is that we know Django expects "<app_label>.<ModelName>" to be globally unique, as such I'm thinking we should, in one way or another, be able to resolve it to a fullname.

This is absolutely possible, but we also need to know the app label, which can be defined in a different file (apps.py).

What I'm thinking is that the plugin should assume all models are registered. It feels out of scope for it to attempt to figure out if the Django runtime considers them registered or not? As such, all statically resolvable "<app_label>.<ModelName>" references would be considered OK. I suppose what becomes a new problem then is ambiguity: if there are seemingly two modules that resolve to the same app label and contain identical model names...

So yeah, it isn't trivial..

Yes, that's what I'm thinking as well, although depending on the dependency graph mypy might not load all models, so we need another step to ensure all models are loaded. Hence my processing of INSTALLED_APPS in the settings module.

From your README:

[...] Another issue we need to resolve is app labels. [...]

We could definitely also process all apps.py to figure out unique app labels, perhaps even translate Django's implementation into "mypy logic". I don't think we need to care about duplicate app labels within the same module. But it would/could be troublesome if two different modules conflict in app label name: we don't know which is installed, and thus not which models.py is referenced...

I don't think conflicts are allowed, so I assume that shouldn't be too much of a problem?

[...] Another approach to getting this is also to manually resolve all the model files and parse the files using the ast library. [...]

Mypy has already parsed the AST and massaged it into its internal data structures, found in nodes.py, where a Python file's (module's) root object is called MypyFile. Couldn't we utilise those structures better?

In the end those are what we need to find to report back to mypy. So we might as well use them to begin with?

Yeah, let me clarify a bit here: Yes, mypy does parse everything we need (at least if we tell it to), but as far as I can tell it doesn't make it available in any of the plugin interfaces in a way that makes it possible for us to use this. It seems like get_additional_deps is the first plugin interface that's guaranteed to be called. This is called with a single MypyFile instance and no additional context on the dependency graph. This is where we need to tell mypy which other files are needed, so from the example above we should ideally specify that customers.models depends on orders.models and vice versa. Unfortunately we have no control over the order files are processed in here, so depending on how you invoke mypy the first thing it gives you is customers/models.py, at which point we have no idea that orders/models.py exists.

So yes, mypy has all the state and context we need, but it's not made available to plugins in a way that lets us define these relationships easily. Hence all my trickery to declare the dependency hierarchy through the settings file.
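A stripped-down sketch of that trickery, without mypy imports so it stays self-contained (the module names are made up, and the real hook receives a MypyFile and returns (priority, module, line) tuples; the priority value here is just an assumed mypy-style constant):

```python
PRI_MED = 10  # assumed medium-priority value in mypy's dependency scheme

# In the real plugin this list would be read out of the settings
# module; hard-coded here to keep the sketch self-contained.
INSTALLED_APPS = ["customers", "orders"]

def get_additional_deps(module_fullname: str) -> list:
    # When mypy hands us the settings module, declare a dependency on
    # every app's models module, so back-references like Customer.orders
    # can be resolved no matter which file mypy processes first.
    if module_fullname == "myproject.settings":
        return [(PRI_MED, f"{app}.models", -1) for app in INSTALLED_APPS]
    return []
```

Routing everything through the settings module is what makes the dependency graph discoverable at all, since no single models.py knows about its reverse relations.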

flaeppe commented 10 months ago

Yeah, let me clarify a bit here: Yes, mypy does parse everything we need (at least if we tell it to), but as far as I can tell it doesn't make it available in any of the plugin interfaces in a way that makes it possible for us to use this. It seems like get_additional_deps is the first plugin interface that's guaranteed to be called. This is called with a single MypyFile instance and no additional context on the dependency graph. This is where we need to tell mypy which other files are needed, so from the example above we should ideally specify that customers.models depends on orders.models and vice versa. Unfortunately we have no control over the order files are processed in here, so depending on how you invoke mypy the first thing it gives you is customers/models.py, at which point we have no idea that orders/models.py exists.

So yes, mypy has all the state and context we need, but it's not made available to plugins in a way that lets us define these relationships easily. Hence all my trickery to declare the dependency hierarchy through the settings file.

Right, that's true, I didn't think of that. INSTALLED_APPS is then a form of global state that would need to be implicitly loaded in order to ensure all types have been collected.

Yes, that's what I'm thinking as well, although depending on the dependency graph mypy might not load all models, so we need another step to ensure all models are loaded. Hence my processing of INSTALLED_APPS in the settings module.

Could it be some sort of first "milestone" to refactor towards swapping out the apps.populate() call for

from importlib import import_module

settings = import_module(xyz.django_settings_module)
installed_apps = settings.INSTALLED_APPS

to collect the context needed to find all registered models?

That would allow us to touch less runtime code at least, and should hopefully not be as slow. But we still fall under the prerequisite that we can resolve everything without the Django runtime.


I just want to say that I'm all open to changes that replace usage of the runtime with solutions through mypy. Hopefully we've collected quite a bunch of regular use cases in our suite that reveal requirements, once one starts fiddling.

ljodal commented 10 months ago

I think a very good first step would be to do a refactoring of how the runtime logic is used. Right now we directly access e.g. the app registry when transforming the models, which we're not going to be able to do if we don't fire up the Django runtime. If we instead load the Django project and collect all the state we need into e.g. some dataclasses, then we can later look into populating that state from reading the source rather than doing it at runtime. This would also allow us to make the "backend" for that state swappable, so it's possible to opt in/out of loading the runtime as needed for each project.
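A sketch of what that swappable state could look like. The names are illustrative, and the static backend returns a placeholder instead of actually parsing anything:

```python
from dataclasses import dataclass, field
from typing import Dict, Protocol

@dataclass
class ModelState:
    fullname: str
    app_label: str
    # reverse accessor name -> fullname of the model it points back to
    related_managers: Dict[str, str] = field(default_factory=dict)

class StateBackend(Protocol):
    def load_models(self) -> Dict[str, ModelState]: ...

class RuntimeBackend:
    """Today's approach: boot Django and walk the app registry."""
    def load_models(self) -> Dict[str, ModelState]:
        raise NotImplementedError  # django.setup(); apps.get_models(); ...

class StaticBackend:
    """Future approach: build the same state by parsing source files."""
    def load_models(self) -> Dict[str, ModelState]:
        # Placeholder result; a real implementation would scan */models.py.
        return {
            "customers.models.Customer": ModelState(
                fullname="customers.models.Customer",
                app_label="customers",
                related_managers={"orders": "orders.models.Order"},
            )
        }
```

Since both backends satisfy the same Protocol, the transformers never need to know which one produced the state, which is exactly what allows the gradual opt-in described above.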

I've also been wondering if just loading up the settings module would be a decent middle ground, but ideally I'd like to not do that either. Again, maybe that can be an option among multiple where you can choose how to get the Django state? There are going to be cases that work today that we cannot possibly support by just statically inspecting the code.

ljodal commented 10 months ago

Another option is to ask for additional hooks from mypy, either a way to defer in the "get additional deps" callback (so we can wait until the settings module has been discovered) or a separate hook that allows us to first find the settings module before we process the models. Django does a lot of "try importing this module" (e.g. for apps.py and models.py), which we can't really replicate without checking the file system. Maybe we could get access to mypy's loader and ask it to load modules?

flaeppe commented 10 months ago

[...] some dataclasses then we can later look into populating that state from reading the source rather than doing it at runtime. [...]

We might perhaps want to keep JSON compatibility in mind here, in case we decide it should go into mypy's metadata.

[...] This would also allow us to make the "backend" for that state swappable, so it's possible to opt-in/out of loading the runtime as needed for each project. [...]

I like the idea of a backend interface, allowing it to change quite independently or take different approaches. Not sure about promoting multiple backends though, but definitely some sort of a nice deprecation tool.

[...] I've also been wondering if just loading up the settings module would be a decent middle ground, but ideally I'd like to not do that either. Again, maybe that can be an option among multiple where you can choose how to get the Django state? [...]

We could definitely phase stuff out over time, but still do changes step by step, trying to keep things reasonably backwards compatible.

[...] There's going to be cases that work today that we cannot possibly support by just statically inspecting the code

Could we investigate? Anything you have off the top of your head?

Another option is to ask for additional hooks from mypy, either a way to defer in the "get additional deps" callback (so we can wait until the settings module has been discovered) or a separate hook that allows us to first find the settings module before we process the models. Django does a lot of "try importing this module" (e.g. for apps.py and models.py), which we can't really replicate without checking the file system. Maybe we could get access to mypy's loader and ask it to load modules?

This is definitely an option, but I think we should then define some sort of clear approach as to why/what/how, for exactly what is needed. What I'm saying is that we should present something that is thought through and that we know is needed and would work, so that we increase our chances of actually getting it and not just become a nuisance.

ljodal commented 10 months ago

[...] some dataclasses then we can later look into populating that state from reading the source rather than doing it at runtime. [...]

We might perhaps want to keep JSON compatibility in mind here, in case we decide it should go into mypy's metadata.

Yeah, but that can be done even with a dataclass. I like the approach taken by the dataclasses plugin in mypy, and I've also used it in other mypy plugins I've written: it has a dataclass that represents the metadata, with .serialize() and .deserialize() methods to convert to/from a pure Python dict.
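Roughly like this; the class and field names are made up for the example, but the pattern is the one described above:

```python
from dataclasses import dataclass
from typing import Dict

# A dataclass holds the metadata, and serialize()/deserialize()
# convert to and from a plain JSON-compatible dict so the state can
# survive cached runs.
@dataclass
class DjangoModelMetadata:
    fullname: str
    related_managers: Dict[str, str]

    def serialize(self) -> dict:
        return {
            "fullname": self.fullname,
            "related_managers": dict(self.related_managers),
        }

    @classmethod
    def deserialize(cls, data: dict) -> "DjangoModelMetadata":
        return cls(
            fullname=data["fullname"],
            related_managers=data["related_managers"],
        )
```

The round trip through a plain dict is what keeps the state JSON-compatible, addressing the mypy-metadata concern raised above.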

[...] This would also allow us to make the "backend" for that state swappable, so it's possible to opt-in/out of loading the runtime as needed for each project. [...]

I like the idea of a backend interface, allowing it to change quite independently or take different approaches. Not sure about promoting multiple backends though, but definitely some sort of a nice deprecation tool.

I guess whether we want/need to support multiple backends long term depends on what limitations come from the no-runtime-information approach. And having it as opt-in during development would be nice; that way we don't need to do everything in one huge PR, but can gradually implement transformations etc.

[...] I've also been wondering if just loading up the settings module would be a decent middle ground, but ideally I'd like to not do that either. Again, maybe that can be an option among multiple where you can choose how to get the Django state? [...]

We could definitely phase stuff out over time, but still do changes step by step, trying to keep things reasonably backwards compatible.

[...] There's going to be cases that work today that we cannot possibly support by just statically inspecting the code

Could we investigate? Anything you have off the top of your head?

I've seen stuff like this in a couple of places, for example in django-mptt. If I remember correctly, mypy is unable to get the MRO of this class, and thus also unable to detect that it's actually a manager. Might be some magic we can do here, but I'm unsure. My concern is that for these cases we'll not be able to detect some managers:

class MyManager(models.Manager.from_queryset(MyQuerySet)):
    ...

Another option is to ask for additional hooks from mypy, either a way to defer in the "get additional deps" callback (so we can wait until the settings module has been discovered) or a separate hook that allows us to first find the settings module before we process the models. Django does a lot of "try importing this module" (e.g. for apps.py and models.py), which we can't really replicate without checking the file system. Maybe we could get access to mypy's loader and ask it to load modules?

This is definitely an option, but I think we should then define some sort of clear approach as to why/what/how, for exactly what is needed. What I'm saying is that we should present something that is thought through and that we know is needed and would work, so that we increase our chances of actually getting it and not just become a nuisance.

Yeah, exactly what we'd need here is a bit unclear to me, and we might be able to make it work as-is, just with a bit more hassle and a dependency graph that's not ideal (because we don't know about back references when declaring dependencies).

delfick commented 1 month ago

I imagine it could work if we spawn a separate process that loads the Django state and communicate with it such that we get the required information rather than the actual Django objects. Then, when we detect changes to the settings, we close that process and spawn another one (currently, trying to reload Django in the same process doesn't work because there's too much global state and caching around imports).

I had a look, and it seems it would be quite an effort to remove Django-specific types from the plugin, though. But as a first step the DjangoContext class could still be in-process, while being progressively changed to no longer expose any Django-specific types.

I believe this is necessary to make dmypy see changes without restarting it every time a setting, model, or installed app is changed.