datamade / django-councilmatic

:heartpulse: Django app providing core functions for *.councilmatic.org
http://councilmatic.org
MIT License
26 stars 16 forks source link

Take advatange of prefetching objects from the database where appropriate #180

Open evz opened 6 years ago

evz commented 6 years ago

In the case where we have a view that requires loading a lot of bills, we should take advantage of using Django's prefetch_related and select_related methods to limit the number of queries. Since Django lazily evaluates querysets, this can mean that anytime we are accessing related objects in a template, under the hood Django is executing a one off query to fetch that object from the database. This can very quickly multiply to mean that hundreds or thousands of queries are being fired anytime someone loads a page.

To make matters worse, we are also accessing related items via model methods which means that, even if we use prefetch_related or select_related in the view code, Django will ignore those objects that it's already fetched from the database and fire off more queries. A good example of this is the current_action property on the Bill model which is not only used on its own but also leveraged by several other model methods (and the methods that use those methods) on the Bill model and elsewhere (including in template code).

A solution to this problem that I just stumbled across today is to check if Django has already cached those objects on the database connection and, if so, use them instead of making another round trip to the database. So, a way of rewriting the current_action property would be like so:

# When executing a query like so:
bills = Bill.object.all().prefetch_related('actions')

# Django will cache the related objects which you can access like so within a model method

class Bill(models.Model):
    ... fields ...

    @property
    def current_action(self):
        if hasattr(self, '_prefetched_objects_cache') and 'actions' in self._prefetched_objects_cache:
            # Return the most recent cached action object
            return sorted(self._prefetched_objects_cache['actions'],
                          key=lambda x: x.order,
                          reverse=True)[0]

        else:
            # If the view code did not use "prefetch_related" for the action objects, fall back to the current way of doing things.
            return self.actions.all().order_by('-order').first() if self.actions.all() else None

Obviously, this is just an example of where we can make some improvements on those model methods. There are probably a lot more that can be improved in a similar way.

reginafcompton commented 6 years ago

I found a potential place to prefetch actions.

In the legislation_item partial we iterate over bill topics and call the tags partial, which uses get_last_action_date: https://github.com/datamade/django-councilmatic/blob/master/councilmatic_core/models.py#L350

This nested business means that we duplicate the same query (sometimes) dozens of times. I'd like to see if we could re-think this logic and prefetch actions somewhere.