DemocracyClub / yournextrepresentative

👥 A website for crowd-sourcing structured election candidate data
https://candidates.democracyclub.org.uk
GNU Affero General Public License v3.0

/elections is crazy slow with lots of elections #874

Open chris48s opened 5 years ago

chris48s commented 5 years ago

At the moment we've got about 5,000 current ballots and we have to load all of them on /elections. I think it's not so much that we're having to do intensive queries; it's more that we are just chucking the browser a lot of content to render.

VirginiaDooley commented 3 years ago

This isn't an issue right now, but it may become one again in April.

chris48s commented 3 years ago

Sure - it isn't an issue right now, because there are only 48 current_or_future() ballots

I suspect this will be a problem again for the May 2022 elections because I think the underlying issue here is not fixed. If you want to try it locally you can set https://github.com/DemocracyClub/yournextrepresentative/blob/21024c22c7ab3c844a3f7918ee9ee0ea2d3eaad7/ynr/apps/elections/views.py#L65-L77 to something like

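from django.db.models import Count, Q

# Assumes Ballot is already imported in views.py.
# The parentheses let the chained calls span several lines.
(
    Ballot.objects.all()
    .select_related("election", "post")
    .prefetch_related("suggestedpostlock_set")
    .prefetch_related("officialdocument_set")
    # Total candidacies on each ballot
    .annotate(memberships_count=Count("membership", distinct=True))
    # Candidacies marked as elected on each ballot
    .annotate(
        elected_count=Count(
            "membership",
            distinct=True,
            filter=Q(membership__elected=True),
        )
    )
    .order_by("election__election_date", "election__name")
)[:5000]  # QuerySets have no .limit(); slice to cap the result at 5,000 rows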

(Untested.) This is not a massively useful query in itself, but it will chuck out a realistic amount of data for the number of ballots you will have in April and give you a repro to work from.
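
To check whether a repro like this is slow on the server or just large, one rough option is to time a single request and count the bytes that come back. This is only a sketch using the standard library: `measure` is a hypothetical helper, not part of YNR, and it says nothing about how long the browser then takes to render the page.

```python
import time
import urllib.request


def measure(url, timeout=60):
    """Return (seconds, bytes) for one GET request to url.

    Covers server time plus transfer only; browser rendering is not included.
    """
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        body = response.read()
    return time.perf_counter() - start, len(body)
```

Against a local dev server this might be called as `measure("http://localhost:8000/elections/")`, where the host and port are assumptions about your setup. A quick response with a very large body would point at rendering rather than queries.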

symroe commented 3 years ago

> I think the underlying issue here is not fixed

We should test this more, and clearly there will be more problems when there are more elections. However, this might help explain things a little more:

  1. I think when this issue was created there was a missing select_related or some other template + iteration O(N) performance problem. That exact problem was fixed in https://github.com/DemocracyClub/yournextrepresentative/pull/990.

  2. In addition to that change, the elections view is now cached for non-logged-in users, meaning load on the server is much less. Previously, any request to that page would cause a huge query that would show in the server CPU graphs.

  3. The PR above and https://github.com/DemocracyClub/yournextrepresentative/pull/1400 added filters to the view, so it's now much easier to get to the elections you want (the pre-filtered views are linked to from the home page around big elections)

  4. The performance wasn't really an issue over the 2021 elections, and they're a good test of "a tonne of elections"
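
For anyone unfamiliar with the kind of problem point 1 describes, here is a toy illustration of why a missing select_related hurts as the number of ballots grows. It is plain Python rather than YNR or Django code: `FakeDB` and both render functions are hypothetical stand-ins that only count how many "queries" each approach would issue.

```python
class FakeDB:
    """Hypothetical stand-in for the ORM that counts every query it runs."""

    def __init__(self, n_ballots):
        self.elections = {1: {"name": "Example election"}}
        self.ballots = [
            {"post": f"Ward {i}", "election_id": 1} for i in range(n_ballots)
        ]
        self.queries = 0

    def ballots_only(self):
        # Like Ballot.objects.all(): one query, related rows not loaded.
        self.queries += 1
        return list(self.ballots)

    def election_for(self, ballot):
        # Like touching ballot.election in a template without select_related.
        self.queries += 1
        return self.elections[ballot["election_id"]]

    def ballots_with_elections(self):
        # Like .select_related("election"): one joined query.
        self.queries += 1
        return [(b, self.elections[b["election_id"]]) for b in self.ballots]


def render_lazy(db):
    # One extra lookup per ballot while iterating.
    return [
        f'{db.election_for(b)["name"]}: {b["post"]}' for b in db.ballots_only()
    ]


def render_joined(db):
    # Everything arrives in the single joined query.
    return [f'{e["name"]}: {b["post"]}' for b, e in db.ballots_with_elections()]


lazy_db, joined_db = FakeDB(5000), FakeDB(5000)
assert render_lazy(lazy_db) == render_joined(joined_db)
print(lazy_db.queries)    # 5001
print(joined_db.queries)  # 1
```

Both versions produce the same rows, but the lazy one issues one query per ballot plus one for the list, so the cost scales with the number of current ballots.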

Because of the above, this issue is really "rethink the UX of the elections page". We want to maintain the UX win that is ctrl+F for power users, allow new users to e.g. find the next ballot that needs bulk adding, and support the various other things we know users do with that page, all while keeping performance nice and smooth regardless of the number of elections.

To expand the scope a little more, this should also include the UX of data discovery and historic elections: if we have a nice interface for looking for loads of current elections then we should make sure it works for previous elections, and link to the CSV/API versions of them.

I don't know if this issue is useful in that context, but I'm happy to keep it open to remind us to do some performance testing once we've made the May 2022 elections.