Limit the size of pk__in lists when fetching related objects

jtharpla commented 6 years ago

When fetching related objects by ID in fetch_related, we currently place no bound on the number of IDs fetched at once (https://github.com/closeio/flask-common/blob/master/flask_common/documents.py#L291). This can produce MongoDB queries with 800 - 1000 IDs in a single $in list (pk__in in MongoEngine). Queries with more than 100 - 200 IDs in one $in list can cause considerable CPU usage on the mongod node that handles them. Maybe we could execute the query in batches of 100 - 200 IDs at a time (ideally 100) and then combine the results into a single list.
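A minimal sketch of the batching idea, not the actual fetch_related code: the helper names `chunked` and `fetch_in_batches` and the `fetch_batch` callback are hypothetical and only illustrate splitting the ID list and combining the results. The de-duplication step is an added assumption, so that an ID repeated across two batches does not return the same object twice.

```python
from typing import Callable, Iterator, List, Sequence, TypeVar

K = TypeVar("K")  # type of the IDs (e.g. ObjectId)
T = TypeVar("T")  # type of the fetched objects


def chunked(ids: Sequence[K], size: int) -> Iterator[Sequence[K]]:
    """Yield consecutive slices of `ids`, each with at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def fetch_in_batches(
    ids: Sequence[K],
    fetch_batch: Callable[[Sequence[K]], List[T]],
    batch_size: int = 100,
) -> List[T]:
    """Run one query per batch of IDs and combine the results into one list.

    `fetch_batch` is a hypothetical callback that runs a single
    pk__in query for the IDs it is given.
    """
    # Assumption: drop duplicate IDs (keeping first-seen order) so an ID
    # that would land in two batches is only fetched once.
    unique_ids = list(dict.fromkeys(ids))
    results: List[T] = []
    for chunk in chunked(unique_ids, batch_size):
        results.extend(fetch_batch(chunk))
    return results
```

With MongoEngine, the callback could be something like `lambda chunk: list(Model.objects.filter(pk__in=chunk))`, where `Model` stands in for whatever document class is being fetched. Note that this trades one large query for several smaller ones, so the total number of round trips goes up.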

jtharpla commented 6 years ago

@tsx I believe this captures what we discussed?

tsx commented 6 years ago

Yep. See also the issue in the main repo ^

closeio/flask-common, issue #45