pylint-dev / astroid

A common base representation of python source code for pylint and other projects
https://pylint.readthedocs.io/projects/astroid/en/latest/
GNU Lesser General Public License v2.1
529 stars 275 forks source link

Django Manager.from_queryset causing significant performance issues #532

Closed sdcooke closed 1 year ago

sdcooke commented 6 years ago

Changing astroid version from 1.4.9 to 1.5.0 our pylint run time significantly increased. We managed to establish that the bulk of the time was spent inferring the use of Manager.from_queryset in our Django code - e.g.

class MyQuerySet(QuerySet):
    pass

class MyManager(Manager.from_queryset(MyQuerySet)):
    pass

My knowledge of astroid is limited so I don't know exactly what it's doing but we managed to write a hacky pylint plugin to workaround the issue:

from astroid import MANAGER, nodes, inference_tip, Uninferable

def predicate(call):
    return getattr(call.func, 'attrname', '') == 'from_queryset'

def transform(self, context=None):
    return iter([Uninferable])

def register(linter):
    MANAGER.register_transform(nodes.Call, inference_tip(transform), predicate)

The run time went from ~570 seconds to ~160 seconds.

I'm pretty sure astroid wasn't managing to correctly infer the method anyway so I don't think this actually changes anything.

The workaround is good enough for now but I thought I'd post the details here in case someone else runs into the same issue we had. I can't provide our codebase as an example but this is the source of the code in Django https://github.com/django/django/blob/cf8fc4797458b2c788ecf0be0afca6b0512ce1c0/django/db/models/manager.py#L165.

PCManticore commented 6 years ago

Hi @sdcooke and thanks for creating this issue!

This performance issue sounds bad, we'll try to see what's happening. We recently started looking more into the performance of astroid & pylint, especially with #497 and #519, there is also a performance dataset that we're looking in integrating with pylint, the general idea being that pylint is kind of slow and we're looking into ways of improving this aspect. Regarding the workaround, we're usually not adding inference tips that are prohibiting the inference, e.g we always have to try to infer something, even if by default this might be prohibitive from the point of view of the performance. But your transform might actually be something that could be integrated in pylint-django, you should ping them as well, as they might be interested in providing a fix for from_queryset.

sdcooke commented 6 years ago

I agree the workaround is definitely not ideal - if I weren't so out of my depth I would have tried to return the correct thing! Having said that I'm pretty sure that without the transform it wasn't managing to infer anything anyway.

Thank you - I've opened an issue on pylint-django as well (https://github.com/PyCQA/pylint-django/issues/147)

twidi commented 5 years ago

@sdcooke Thanks to your workaround, the time passed on our CI for pylint just passed from 13 minutes (with 16 processes) to less than 2

jacobtylerwalls commented 1 year ago

Thanks for identifying the issue and posting the workaround. We've been merging some performance improvements to astroid. I tried to reproduce this issue the main branch and couldn't identify anything out of the ordinary. Feel free to reopen if you can reproduce the same performance differential on a recent version (or on main). I'd be glad to help distill a reproducer at that point and continue the investigation from there. Thanks.