Open dzhelezov opened 3 years ago
About the query performance, If I understand you correctly.
There are only two queries for fetching the Author
and all the posts by that author. For the relationships, we use field resolvers and we do left join
there. Here is an example query, (the channel I am querying has two videos):
Graphql query
query {
channel(where:{id:"84"}) {
handle
videos {
title
}
}
}
Generated sql queries
query: SELECT "channel"."id" AS "channel_id", "channel"."handle" AS "channel_handle" FROM "channel" "channel" WHERE "channel"."id" = $1 AND "channel"."deleted_at" IS NULL LIMIT 1 -- PARAMETERS: ["84"]
query: SELECT "Channel"."id" AS "Channel_id", "Channel"."created_at" AS "Channel_created_at", "Channel"."created_by_id" AS "Channel_created_by_id", "Channel"."updated_at" AS "Channel_updated_at", "Channel"."updated_by_id" AS "Channel_updated_by_id", "Channel"."deleted_at" AS "Channel_deleted_at", "Channel"."deleted_by_id" AS "Channel_deleted_by_id", "Channel"."version" AS "Channel_version", "Channel"."handle" AS "Channel_handle", "Channel"."description" AS "Channel_description", "Channel"."cover_photo_url" AS "Channel_cover_photo_url", "Channel"."avatar_photo_url" AS "Channel_avatar_photo_url", "Channel"."is_public" AS "Channel_is_public", "Channel"."is_curated" AS "Channel_is_curated", "Channel"."language_id" AS "Channel_language_id", "Channel"."happened_in_id" AS "Channel_happened_in_id", "Channel__videos"."id" AS "Channel__videos_id", "Channel__videos"."created_at" AS "Channel__videos_created_at", "Channel__videos"."created_by_id" AS "Channel__videos_created_by_id", "Channel__videos"."updated_at" AS "Channel__videos_updated_at", "Channel__videos"."updated_by_id" AS "Channel__videos_updated_by_id", "Channel__videos"."deleted_at" AS "Channel__videos_deleted_at", "Channel__videos"."deleted_by_id" AS "Channel__videos_deleted_by_id", "Channel__videos"."version" AS "Channel__videos_version", "Channel__videos"."channel_id" AS "Channel__videos_channel_id", "Channel__videos"."category_id" AS "Channel__videos_category_id", "Channel__videos"."title" AS "Channel__videos_title", "Channel__videos"."description" AS "Channel__videos_description", "Channel__videos"."duration" AS "Channel__videos_duration", "Channel__videos"."skippable_intro_duration" AS "Channel__videos_skippable_intro_duration", "Channel__videos"."thumbnail_url" AS "Channel__videos_thumbnail_url", "Channel__videos"."language_id" AS "Channel__videos_language_id", "Channel__videos"."media_id" AS "Channel__videos_media_id", "Channel__videos"."has_marketing" AS "Channel__videos_has_marketing", "Channel__videos"."published_before_joystream" AS "Channel__videos_published_before_joystream", "Channel__videos"."is_public" AS "Channel__videos_is_public", "Channel__videos"."is_curated" AS "Channel__videos_is_curated", "Channel__videos"."is_explicit" AS "Channel__videos_is_explicit", "Channel__videos"."license_id" AS "Channel__videos_license_id", "Channel__videos"."happened_in_id" AS "Channel__videos_happened_in_id", "Channel__videos"."is_featured" AS "Channel__videos_is_featured" FROM "channel" "Channel" LEFT JOIN "video" "Channel__videos" ON "Channel__videos"."channel_id"="Channel"."id" WHERE "Channel"."id" IN ($1) -- PARAMETERS: ["84"]
Since each video has a reference to a channel the following query will result in making N (number of videos) queries:
query {
videos {
title
channel {
handle
}
}
}
UPD: the N+1 queries issue arises only for 2+ level queries, e.g.
query {
channels {
title
video {
title
license {
id
}
}
}
}
At the moment we don't have any meaningful server-side caching for Hydra GraphQL server API. For any non-trivial load, this would lead to poor query performance.
Author
by ID and then allPosts
by the that author. The current implementation of resolvers would first fetch the author entity by ID, join the posts and then each post is fetched by ID. This leads to the N+1 queries problem and very poor performance (e.g. the initial hydra query sent by Atlas takes about 500ms). In order to mitigate this issue, Dataloaders should be used instead of direct SQL queries in the relationship field resolver. This will replace N+1 queries with just two, and moreover enable caching for even better performance.