yezyilomo / django-restql

Turn your API made with Django REST Framework(DRF) into a GraphQL like API.
https://yezyilomo.github.io/django-restql
MIT License
616 stars 43 forks source link

Support shared eager loading fields #133

Closed ashleyredzko closed 4 years ago

ashleyredzko commented 4 years ago

Sometimes, more than one field might want to share the same prefetch related. When doing this with strings (such as prefetch_related("related_thing")), multiple instances do not cause issues. However, when attempting to prefetch using a Prefetch object, it can cause issues if the same query is specified with the same to_attr.

Example:

prefetch_related = {
    "books": "books",
    "latest_course_book": Prefetch("books", queryset=Book.objects.all().order_by("-id"), to_attr="reversed_books"),
    "course_book_list": Prefetch("books", queryset=Book.objects.all().order_by("-id"), to_attr="reversed_books"),
}

Both of these fields care about either an order or the last book. Ordering by the reverse ID is ideal when prefetching then, but we don't want to do it more than once (which is what happens if we specify different to_attr) and providing the same to_attr causes an error. Instead, I propose a way to specify shared prefetch_related values to prevent this issue:

prefetch_related = {
    "books": "books",
}
shared_prefetch_related = [
    {
        "fields": ["latest_course_book", "course_book_list"],
        "prefetch": Prefetch("books", queryset=Book.objects.all().order_by("-id"), to_attr="reversed_books"),
    }
]
yezyilomo commented 4 years ago

While Prefetch doesn't allow setting the same to_attr you can still make both fields use the same prefetched results by setting the same source kwarg in serializers and prefetch only once on your view. It's like you create a shared cache where one field writes and the other reads. Here is an example

# CourseSerializer.py
latest_course_books = BookSerializer(source="common_name", many=True, read_only=True)
course_book_list = BookSerializer(source="common_name", many=True, read_only=True)
# CourseView.py
# No need to prefetch `latest_course_books` bcuz it's sharing the source with `course_book_list`

prefetch_related = {
    "books": "books",
    "course_book_list": Prefetch("books", queryset=Book.objects.all().order_by("-id"), to_attr="common_name")
}

Throwing an error when using the same to_attr makes sense to me because why would you want to do that?, cuz the first prefetch would create the attribute and store the result and then the second prefetch would update it with the same content but prefetched again which is meaningless(Exactly why Django is throwing an error telling you that lookup was already seen with a different queryset. You may need to adjust the ordering of your lookups), so the idea here is to let the first field create and update the attribute and let others consume the same result, that's what the code above does.

ashleyredzko commented 4 years ago

@yezyilomo If we were to query that endpoint with {-course_book_list}, latest_course_books wouldn't have a source. And if we always prefetch, it defeats the purpose of the eager loading mapping in the way we have it implemented.

yezyilomo commented 4 years ago

If we were to query that endpoint with {-course_book_list}, latest_course_books wouldn't have a source.

You are right, but I still don't see the point of having shared prefetch if at the end it's going to be just two fields with different names but same result, like if course_book_list and latest_course_books are going to be the same what's the point of separating them?, because from my point of view what shared prefetch does is that it allows creating a copy of a field with a different name.

ashleyredzko commented 4 years ago

if course_book_list and latest_course_books are going to be the same what's the point of separating them?

That's just an implementation example. You may need the same data set for more than one field.

yezyilomo commented 4 years ago

You may need the same data set for more than one field.

This is exactly my question why not use that one field rather than making copies and rename?..

ashleyredzko commented 4 years ago

It's still a case for using the same prefetched data for possibly very different fields. One case was we had a prefetch that benefited from having additional select_related on it, and we used that prefetch on two fields. We can't list each field separately for the prefetch_related because they clobber eachother, and putting them on different to_attrs meant we might possibly (usually) fetch the data twice. Our short term fix is to always prefetch, which somewhat defeats the purpose of using eager loading w/RestQL.

yezyilomo commented 4 years ago

Idk it seems like a pretty rare case, I think am struggling to like get a real world example where I would need to use this concept, I mean I get why we should not use the same or different to_attr when using this concept(Thanks to you for making it clear) but the answer to where exactly I would need to use this concept is still not clear to me.

ashleyredzko commented 4 years ago

@yezyilomo I agree this is a pretty niche edge case, and I'm struggling to find a reasonable example case. I'm going to close this, since I don't think the vast majority of people will run into this issue.

For those who do and want a possible solution, consider what Yezy has brought up. Maybe you don't need one of those fields, or they could be consolidated.