dimagi / django-cte

Common Table Expressions (CTE) for Django
Other
334 stars 46 forks source link

Can I count all children in Queryset annotate? #57

Closed johnvonneumann7 closed 2 years ago

johnvonneumann7 commented 2 years ago

I need to count the number of all children as below and use it in annotate, but Is this feasible in cte?

Comment.objects.filter(...).annotate(reply_count=???)
millerdev commented 2 years ago

I'm guessing a bit at what you're asking, but yes, that should be possible.

cte = With(
    Order.objects
    .values("region_id")
    .annotate(number=Count("id"))
)

orders = (
    cte.join(Order, region=cte.col.region_id)
    .with_cte(cte)
    .annotate(region_count=cte.col.number)
)
johnvonneumann7 commented 2 years ago

Thank you. I assume this will work successfully. I need to get the data for the other columns at the same time.

What if I need to include this number in the parent data to get a listing as shown below?

[
  {
    "id": 1,
    "text": "...",
    "reply_count": 10,
  },
  {
    "id": 2,
    "text": "...",
    "reply_count": 7,
  },
  {
    "id": 3,
    "text": "...",
    "reply_count": 3,
  },
]
millerdev commented 2 years ago

You haven't listed the models (tables) you're working with or their relationships, and I'd prefer not to keep guessing. Can you provide the SQL query you want to produce? That will make it easier to tell if you can create the same query with django-cte.

johnvonneumann7 commented 2 years ago

Sorry. The following are those.

To rearrange again, we need to get rows where the parent column is null, and need to annotate the number of all its descendants (all descendants of the hierarchy, not just the immediate children)

class CustomQuerySet(CTEQuerySet):
    ...

class Comment(models.Model):
    post = models.ForeignKey(Post, on_delete=models.CASCADE)
    text = models.TextField(max_length=1000)
    parent = models.ForeignKey(
        "self",
        blank=True,
        null=True,
        on_delete=models.CASCADE,
        related_name="replies",
    )

    objects = CustomQuerySet.as_manager()

I don't know the SQL statement that accomplishes what I want to do, but if I want to get the number of immediate children only Here is what it would look like

Comment.objects.filter(post_id=post).annotate(reply_count=Count("replies"))
SELECT "posts_comment"."id",
       "posts_comment"."text",
       "posts_comment"."parent_id",
       "posts_comment"."post_id",
       COUNT(T3."id") AS "reply_count"
FROM "posts_comment"
LEFT OUTER JOIN "posts_comment" T3 ON ("posts_comment"."id" = T3."parent_id")
WHERE ("posts_comment"."post_id" = 465a22f4-12de-4a4e-a294-88f4abfdec5c
       AND "posts_comment"."parent_id" IS NULL)
GROUP BY "posts_comment"."id"
millerdev commented 2 years ago

You can use a Recursive Common Table Expression to solve that problem. This sounds a bit like a homework problem, so I am stopping short of giving the answer.

johnvonneumann7 commented 2 years ago

Of course, I know how to use the basic recursive common table expression, and I can get all descendants of the parent table, but I don't know how to annotate the number of all descendants (how to use a recursive common table expression in annotate?)

I don't know how to code the "reply_count=???"

Comment.objects.filter(parent__isnull=True).annotate(reply_count=???)

I know that the code to retrieve all descendants would look like this

def make_cte(cte):
    return (
        Comment.objects.filter(parent=parent)
        .union(cte.join(Comment, parent_id=cte.col.id))
    )

cte = With.recursive(make_cte)

return cte.queryset().with_cte(cte)
johnvonneumann7 commented 2 years ago

I tried to annotate reply_count in this way, but reply_count is null for some reason.

def make_cte(cte):
    return Comment.objects.filter(parent=OuterRef("pk")).union(
        cte.join(Comment, parent_id=cte.col.id)
    )

cte = With.recursive(make_cte)

subquery = Subquery(
    cte.queryset()
    .with_cte(cte)
    .annotate(count=Count("pk"))
    .values("count"),
    output_field=IntegerField(),
)

Comment.objects.filter(post_id=post).annotate(reply_count=subquery)