There is a scaling problem with our DeleteOrphanedObjects job. It currently cannot handle the large queries generated by prod/staging platforms: MongoDB's 16MB limit on a single BSON document is easy to hit on the query document once there are thousands to tens of thousands of namespaces.
The new approach is to leverage MongoDB aggregation pipelines to handle these large requests. It may be enough to simply move the request into a pipeline to get past the 16MB document limit, but either way this approach makes it easy to break the request into multiple stages. Aggregation pipelines allow up to 1,000 stages before hitting a limit of their own, and documents flowing through intermediate stages may exceed 16MB (only the documents in the final result set are bound by that limit): https://www.mongodb.com/docs/manual/core/aggregation-pipeline-limits/
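A minimal sketch of the staged-$match idea, in Python with PyMongo. The collection name, the "namespace" field, the connection string, and the chunk sizes are assumptions for illustration, not taken from the actual job:

```python
from itertools import islice

from pymongo import MongoClient

# Hypothetical names for illustration: an "objects" collection with a
# "namespace" field, and a pre-computed list of live namespace names.
client = MongoClient("mongodb://localhost:27017")
coll = client["mydb"]["objects"]

CHUNK_SIZE = 10_000  # namespaces per $match stage; tune to stay well under 16MB


def chunked(seq, size):
    """Yield successive fixed-size chunks from an iterable."""
    it = iter(seq)
    while chunk := list(islice(it, size)):
        yield chunk


def find_orphaned_ids(live_namespaces):
    # One $match stage per chunk: a document survives every stage only if its
    # namespace is absent from every chunk, i.e. absent from the full list --
    # the same result as a single huge $nin, but the namespace list is spread
    # across multiple stages instead of sitting in one filter value.
    pipeline = [
        {"$match": {"namespace": {"$nin": chunk}}}
        for chunk in chunked(live_namespaces, CHUNK_SIZE)
    ]
    pipeline.append({"$project": {"_id": 1}})
    # allowDiskUse lets stages that exceed the in-memory limit spill to disk.
    return [doc["_id"] for doc in coll.aggregate(pipeline, allowDiskUse=True)]


def delete_orphaned_objects(live_namespaces, batch_size=1_000):
    # Delete in small batches so no single delete filter gets large either.
    for batch in chunked(find_orphaned_ids(live_namespaces), batch_size):
        coll.delete_many({"_id": {"$in": batch}})
```

Projecting only _id values and deleting in small batches also keeps each delete command's filter far below the 16MB document limit.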