yorkie-team / yorkie

Yorkie is a document store for collaborative applications.
https://yorkie.dev
Apache License 2.0
771 stars 143 forks source link

Resolve N+1 issue in GetDocuments API response when snapshot is included #933

Open kokodak opened 1 month ago

kokodak commented 1 month ago

What would you like to be added:

Currently, when the GetDocuments API endpoint is called and the response needs to include a snapshot, an N+1 issue occurs. This problem arises because packs.BuildDocumentForServerSeq() is called internally N times.

Within packs.BuildDocumentForServerSeq(), two database queries are executed, leading to the N+1 problem. To resolve this issue, the database query within this method should be converted to a bulk version.

If there is a need for a snapshot preview when displaying documents in CodePair, resolving this issue could result in performance improvements.

https://github.com/yorkie-team/yorkie/blob/a4ce314d6e6091049402f8f8876cedd0c516f0c0/server/documents/documents.go#L136-L144

Why is this needed:

To address the N+1 problem in the GetDocuments API response to enhance performance.

kokodak commented 1 month ago

https://github.com/kokodak/yorkie/commit/e6008910669be6695180b2922f8d5b5c659eb8bf

The attached commit is the code for a bulk query implementation I've worked on before.

In the case of FindClosestSnapshotInfo() bulk, the required query is rather complicated, so I configured the query by configuring the Aggregate pipeline. When sorting and grouping in the pipeline, it would be good to consider the DB Index applied to the Snapshot Collection.

For the bulk of FindChangesBetweenServerSeqs(), there is a high probability of memory related issues, as it is looking for a very large number of Changes if a Snapshot is missing, and we need to be aware of this issue.