MTG / freesound

The Freesound website
https://freesound.org
GNU Affero General Public License v3.0
311 stars 87 forks source link

Optimize display_pack templatetag #1387

Open ffont opened 4 years ago

ffont commented 4 years ago

display_pack is currently not as efficient as we'd like as we're performing a number of extra queries per pack. Also, we're using an HTML cache per Pack display which is not really working well because some some queries are anyways evaluated in the view code of the templatetag.

Displaying a pack is not straightforward because we need to obtain pack information, select 3 sounds randomly from the pack and get information about these sounds to show on screen (some aggregated, like the list of most used tags in a pack).

We need to define a strategy to follow every time we want to show a list of packs on screen. After some internal discussions these are the three options we'd like to test when we want to display n:

Option A 1+n queries per page, 1 query after cached (this is the most similar to current approach)

* packs view:

qs = Pack.objects.filter(...)  <- query 0, not yet evaluated
paginate(request, qs, settings.PACKS_PER_PAGE) <- query 0 gets evaluated

* packs template:

{% cache packs_list ... %}
{% for pack in page.object_list %} <- this evaluates query 0
    {% display_pack pack %} <- makes one extra query per pack to get all sounds' info
{% end for %}
{% end cache %}

Option B 2-3 queries per page, 1 query after cached

* packs view:

qs = Pack.objects.filter(...)  <- query 0, not yet evaluated
paginate(request, qs, settings.PACKS_PER_PAGE) <- query 0 gets evaluated

* packs template:

{% cache packs_list ... %}
{% get_packs_data page.object_list as packs %} <- have some tag which gets information about all sounds to be displayed at once (with 1 or 2 more queries to get sound IDs per pack and then sound info for the selected sounds)
{% for pack in packs %} 
    {% display_pack pack %} <- makes no more queries
{% end for %}
{% end cache %}

Option C 2 queries per page, 1 query after cached

* packs view:

qs = Pack.objects.filter(...).annotate(sound_ids)  <- query 0, not yet evaluated, includes sound ids per pack
paginator = paginate(request, qs, settings.PACKS_PER_PAGE) <- query 0 gets evaluated
page = paginator["page"]
sound_ids_to_display = [sample(pack.sound_ids, 3) for pack in page] <- get info from qs, already evaluated
sounds_info = Sound.objects.filter(id__in) <- query 1, not yet evaluated

* packs template:

{% cache packs_list ... %}
{% something to trigger evaluation of sounds_info %}
{% for pack in page.object_list %} 
    {% display_pack pack sounds_info %} <- makes no more queries
{% end for %}
{% end cache %}

We should see which one is faster for different numbers of packs and implement it. These options assume that we remove the cache for individual packs and instead we use caches for the most visited pack list blocks like those in the packs page. Removing individual cache in packs will also simplify the invalidate cache methods (which also need to be reviewed and modified accordingly).

Summary of things to do for this PR:

ffont commented 9 months ago

This one has changed a a lot with the new UI but still it should be optimized as currently it is very slow.