Optimize queries for tree visualization

Now that users can add new taxa to the database, the public taxonomic tree needs a way to show only “public” taxa (i.e., taxa that have published media files).

Formerly, the tree was generated using all the taxa in the database. That worked because a taxon was only added when a media file tagged with it was published. Now, however, a taxon can be added before its associated media files are published. Old queries:

# meta/templatetags/extra_tags.py
taxa = Taxon.objects.select_related('parent')
# meta/views.py
genera = Taxon.objects.filter(rank_en='Genus').order_by('name')

As of d61536f9b6a04b0e3c9fcc6b9fc6a57f8e3b73bc, new queries are in place. They work, but they take a long time to run: checking the status of every media file of every taxon and then collecting its ancestors is inefficient.

# meta/templatetags/extra_tags.py
taxa = Taxon.objects.filter(media__status='published').get_ancestors(include_self=True)
# meta/views.py
genera = Taxon.objects.filter(media__status='published').get_ancestors(include_self=True).filter(query)

The best solution I see so far is adding an is_public field to the Taxon model, which is False by default and only becomes True (for the taxon and its ancestors) when associated media is published. The queries would then be:

# meta/templatetags/extra_tags.py
taxa = Taxon.objects.filter(is_public=True).select_related('parent')
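
To make the proposed flag propagation concrete, here is a minimal, framework-free sketch. The `Taxon` dataclass is only an in-memory stand-in for the Django model, and `mark_public` is a hypothetical helper name, not existing code in the repository. It assumes the flag is only ever set through this helper, so a public taxon always has public ancestors and the walk can stop early. It does not cover resetting the flag when media is unpublished.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class Taxon:
    """In-memory stand-in for the Django Taxon model (illustration only)."""

    name: str
    parent: Optional["Taxon"] = None
    is_public: bool = False  # proposed field, False by default


def mark_public(taxon: Taxon) -> list[str]:
    """Flag a taxon and its ancestors as public.

    Returns the names of the taxa whose flag actually changed.
    Hypothetical helper: in the real app this would run when a media
    file tagged with the taxon is published.
    """
    changed = []
    node = taxon
    # Assumption: if a taxon is already public, so are all its ancestors,
    # so the walk up the tree can stop at the first public node.
    while node is not None and not node.is_public:
        node.is_public = True
        changed.append(node.name)
        node = node.parent
    return changed
```

With this in place, the cost of finding public ancestors is paid once at publication time, and building the tree becomes a plain filter on is_public instead of a join against every media file.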

Moreover, we should consider removing the species list from the taxa_page, since it's not very useful. Also, remember that MPTT is no longer maintained (#167).
