feincms / django-tree-queries

Adjacency-list trees for Django using recursive common table expressions. Supports PostgreSQL, sqlite, MySQL and MariaDB.
https://django-tree-queries.readthedocs.io/
BSD 3-Clause "New" or "Revised" License
415 stars 27 forks source link

Admin: adding with_tree_fields breaks the search in the list view #56

Closed nuarhu closed 1 year ago

nuarhu commented 1 year ago

Using version 0.15

Hi there,

first, thanks for this useful library!

It took me a while to realize what a nice feature "with_tree_fields" is and that it already comes with ordering by custom text value. This has the great effect that the list in the admin is already sorted by full paths.

I was greedy and thought that this will also enable the search on the full paths. However, now that I've changed the queryset to always include "with_tree_fields" (for sorting), it's not possible to search in the admin list view anymore.

My model:

class TranslationKey(TreeNode):
    label = CharField(
        null=False,
        blank=False,
        max_length=255,
        db_index=True
    )
    objects = TreeQuerySet.as_manager(with_tree_fields=True)

    class Meta:
        ordering = ['label']
        unique_together = ['parent', 'label']

    @property
    def full_label(self):
        tree_ordering = getattr(self, 'tree_ordering', None)
        if tree_ordering:
            # this is only present for objects fetched via TranslationKey.objects
            return '.'.join(tree_ordering)
        return '.'.join([tk.label for tk in self.ancestors(include_self=True)])
```python

The admin:

```python
class TranslationKeyAdmin(ModelAdmin):
    list_display = ['full_label']
    search_fields = ['label', 'translation__value']
    readonly_fields = ('full_label',)

The error is the following:

[2023-08-17 17:40:36,809] ERROR - basehttp "GET /translations/groupedtranslation/?q=effect HTTP/1.1" 500 211263
[2023-08-17 17:40:43,584] ERROR - log Internal Server Error: /translations/groupedtranslation/
Traceback (most recent call last):
  File "/XXX/venv/lib/python3.11/site-packages/django/db/backends/utils.py", line 89, in _execute
    return self.cursor.execute(sql, params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
psycopg2.errors.UndefinedTable: missing FROM-clause entry for table "v0"
LINE 101: ...e".tree_ordering) ASC LIMIT 1) AND (__tree.tree_pk = V0.id))

(Explaining the URL: this model is configured as "GroupedTranslations" (via proxy) in the admin.)


Including the output for queryset.query dumped via overriden get_search_results in the Admin class:

class TranslationKeyAdmin(ModelAdmin):
    # omitted config (see above)

    def get_search_results(self, request, queryset, search_term):
        queryset, may_have_duplicates = super().get_search_results(request, queryset, search_term)
        LOGGER.debug('>>>>>>>>>>>>>>>>>< queryset %s', queryset.query)
        return queryset, may_have_duplicates
[2023-08-17 17:56:04,592] DEBUG - base >>>>>>>>>>>>>>>>>< queryset 
    WITH RECURSIVE __tree (
        "tree_depth",
        "tree_path",
        "tree_ordering",
        "tree_pk"
    ) AS (
        SELECT
            0 AS tree_depth,
            array[T.id] AS tree_path,
            array["label"]::text[] AS tree_ordering,
            T."id"
        FROM translations_translationkey T
        WHERE T."parent_id" IS NULL

        UNION ALL

        SELECT
            __tree.tree_depth + 1 AS tree_depth,
            __tree.tree_path || T.id,
            __tree.tree_ordering || "label"::text,
            T."id"
        FROM translations_translationkey T
        JOIN __tree ON T."parent_id" = __tree.tree_pk
    )
    SELECT (__tree.tree_depth) AS "tree_depth", (__tree.tree_path) AS "tree_path", (__tree.tree_ordering) AS "tree_ordering", "translations_translationkey"."id", "translations_translationkey"."parent_id", "translations_translationkey"."label" FROM "translations_translationkey" LEFT OUTER JOIN "translations_translation" ON ("translations_translationkey"."id" = "translations_translation"."key_id") , "__tree" WHERE (NOT (EXISTS(
    WITH RECURSIVE __tree (
        "tree_depth",
        "tree_path",
        "tree_ordering",
        "tree_pk"
    ) AS (
        SELECT
            0 AS tree_depth,
            array[T.id] AS tree_path,
            array["label"]::text[] AS tree_ordering,
            T."id"
        FROM translations_translationkey T
        WHERE T."parent_id" IS NULL

        UNION ALL

        SELECT
            __tree.tree_depth + 1 AS tree_depth,
            __tree.tree_path || T.id,
            __tree.tree_ordering || "label"::text,
            T."id"
        FROM translations_translationkey T
        JOIN __tree ON T."parent_id" = __tree.tree_pk
    )
    SELECT 1 AS "a" FROM "translations_translationkey" U0 LEFT OUTER JOIN "translations_translation" U1 ON (U0."id" = U1."key_id") , "__tree" WHERE (U1."id" IS NULL AND U0."id" = ("translations_translationkey"."id") AND (__tree.tree_pk = U0.id)) ORDER BY ("__tree".tree_ordering) ASC LIMIT 1)) AND (UPPER("translations_translationkey"."label"::text) LIKE UPPER(%effect%) OR UPPER("translations_translation"."value"::text) LIKE UPPER(%effect%)) AND (__tree.tree_pk = translations_translationkey.id)) ORDER BY ("__tree".tree_ordering) ASC
[2023-08-17 17:56:04,779] ERROR - log Internal Server Error: /translations/groupedtranslation/
Traceback (most recent call last):
  File "/XXX/venv/lib/python3.11/site-packages/django/db/backends/utils.py", line 89, in _execute
    return self.cursor.execute(sql, params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
psycopg2.errors.UndefinedTable: missing FROM-clause entry for table "v0"
LINE 101: ...ranslationkey"."id")) LIMIT 1) AND (__tree.tree_pk = V0.id))

Thanks again!

Python 3.11.4 Django 4.2.3 django-tree-queries 0.15.0


EDIT (trying to find a work around):

Search works with the following manager:

objects = TreeQuerySet.as_manager()

but if I change that back to

objects = TreeQuerySet.as_manager(with_tree_fields=True)

and then override get_search_results in the Admin to reset the queryset, it is still broken:

    def get_search_results(self, request, queryset, search_term):
        LOGGER.debug('>>>>>>>>>>>>>> qs type %s', type(queryset))
        queryset, may_have_duplicates = super().get_search_results(request, queryset.without_tree_fields(), search_term)
        LOGGER.debug('>>>>>>>>>>>>>>>>>< queryset %s', queryset.query)
        return queryset, may_have_duplicates

Looking at the code, I really would have thought that TreeQuerySet.as_manager() and TreeQuerySet.as_manager(with_tree_fields=True).without_tree_fields() would be equivalent but it seems not.

nuarhu commented 1 year ago

Here is my work around - unfortunately, it does only work as long as there is no filter active:

class TranslationKeyAdmin(ModelAdmin):
    # omitted config

    def get_search_results(self, request, queryset, search_term):
        qs_1, may_have_duplicates = super().get_search_results(request, queryset, search_term)
        bits = smart_split(search_term)
        ids_2 = [n.pk for bit in bits for n in queryset.all() if bit in n.full_label]
        queryset = queryset.filter(
            pk__in=set(list(qs_1.values_list('pk', flat=True)) + ids_2))
        return queryset, False

Explanation:

The break occurs whenever may_have_duplicates is True and the respective code in Django is triggered. The resulting changes break the query.

The workaround circumvents this by simply collecting the PKs for all hits, and querying them explicitly. This query won't ever return duplicates so it's save to return False from the query. It's not really generic a generic solution but so far, looks good and is fast (we don't have much data, though).

But this does not resolve the problem completely: once a list filter is added (another query param, and more queryset actions occur), it breaks again.

Note: I tried to create a composed query, first, (with queryset |= ....) but this failed with:

ValueError: When merging querysets using 'or', you cannot have extra(select=...) on both sides.
matthiask commented 1 year ago

Thanks for the nice words :)

Looking at the code, I really would have thought that TreeQuerySet.as_manager() and TreeQuerySet.as_manager(with_tree_fields=True).without_tree_fields() would be equivalent but it seems not.

That's indeed surprising. I wonder what's going on here? cls._with_tree_fields is only accessed in TreeManager.get_queryset so I don't really understand why .without_tree_fields wouldn't reverse all effects.

The workaround circumvents this by simply collecting the PKs for all hits, and querying them explicitly. This query won't ever return duplicates so it's save to return False from the query. It's not really generic a generic solution but so far, looks good and is fast (we don't have much data, though).

Yes, that's the thing I do as well when I hit issues with table aliassing etc.

The problem is that Django doesn't really understand what django-tree-queries is doing. Something like https://github.com/dimagi/django-cte/ which integrates deeply with Django's Query class would probably work better for advanced use cases, but it wasn't ready (IIRC) when I implemented django-tree-queries initially and it has generally worked so well that I didn't have a good reason to look at django-cte again in the meantime, except out of cursory interest.

Regarding your report: I'm not sure if you're suggesting to change anything, or maybe to add something to the documentation? I generally think workarounds are fine since Django is a web framework for perfectionists with deadlines 😎

nuarhu commented 1 year ago

Hi @matthiask,

concerning why TreeQuerySet.as_manager() and TreeQuerySet.as_manager(with_tree_fields=True).without_tree_fields() are not equivalent:

That is probably because TreeQuerySet.as_manager runs only once - when the model is imported, and it sets manager_class._with_tree_fields = with_tree_fields at that point. This will never get changed later on.

At this point, I've got my code to work with the previously described work around for the search and by falling back to the base manager (._base_manage) in my custom Admin filter. My code samples here might help others to find a way around this so maybe it's already "kind of" documented - or it could be linked from the documentation.

I'm still trying to wrap my head around CTEs so I'm in no way confident enough to try to resolve this better or maybe integrate django-cte or ...

tl;dr - I'm closing this, may it be of help to others. Thanks again!