Open vjeffrey opened 5 years ago
The Elasticsearch Bulk API cannot currently be used for compliance report ingestion. For each ingested report, an Elasticsearch update-by-query runs to unmark the previous report as the latest. Update-by-query cannot go through the Bulk API because it has to search for the document that needs to change rather than address it by ID. config-mgmt-service does not have this problem: it keeps a separate index that contains only the latest run, so updating a single document by its ID in that index can use the Bulk API.
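To illustrate the difference, here is a minimal Go sketch of the newline-delimited body that the Bulk API accepts for update-by-ID actions, which is the shape a separate "latest" index would allow. The index name `latest-reports`, the `bulkAction` type, and the `buildBulkBody` helper are hypothetical and are not taken from the a2 codebase. Older Elasticsearch versions may also require a `_type` in the action line or the request path.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// bulkAction is a hypothetical description of one update-by-ID action.
type bulkAction struct {
	Index string                 // target index (assumed name, not from a2)
	ID    string                 // document ID, known up front, so no search is needed
	Doc   map[string]interface{} // partial document to merge or upsert
}

// buildBulkBody renders the actions as the NDJSON payload for POST /_bulk:
// one action/metadata line followed by one source line per action.
func buildBulkBody(actions []bulkAction) (string, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf) // Encode writes a trailing newline after each value
	for _, a := range actions {
		meta := map[string]map[string]string{
			"update": {"_index": a.Index, "_id": a.ID},
		}
		if err := enc.Encode(meta); err != nil {
			return "", fmt.Errorf("encoding action line: %w", err)
		}
		body := map[string]interface{}{"doc": a.Doc, "doc_as_upsert": true}
		if err := enc.Encode(body); err != nil {
			return "", fmt.Errorf("encoding source line: %w", err)
		}
	}
	return buf.String(), nil
}

func main() {
	body, err := buildBulkBody([]bulkAction{
		{Index: "latest-reports", ID: "node-1", Doc: map[string]interface{}{"report_id": "r-2"}},
	})
	if err != nil {
		panic(err)
	}
	fmt.Print(body)
}
```

An update-by-query has no equivalent action line, because it carries a query instead of an `_id`; that is why it has to be sent as its own request per report.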
User Story
In https://github.com/chef/a2/pull/5067 the client-runs ingestion pipeline was modified to use the Elasticsearch Bulk API with bundled messages. This is a huge improvement for that pipeline, so let's implement the same approach in the compliance ingestion pipeline. Please see https://github.com/chef/a2/pull/5067/files for more details.
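As a rough idea of what bundling messages could look like, the Go sketch below groups incoming messages into fixed-size bundles so that each bundle can become a single bulk request. It is not the implementation from the linked PR; `reportMsg` and `bundleMessages` are hypothetical names, and a real pipeline would probably also flush on a timer so that a partial bundle does not wait indefinitely.

```go
package main

import "fmt"

// reportMsg is a hypothetical stand-in for one compliance report message.
type reportMsg struct {
	NodeID   string
	ReportID string
}

// bundleMessages reads messages from in and emits them in bundles of up to
// maxSize. A final, smaller bundle is emitted once in is closed.
func bundleMessages(in <-chan reportMsg, maxSize int) <-chan []reportMsg {
	if maxSize < 1 {
		maxSize = 1 // guard against a zero or negative bundle size
	}
	out := make(chan []reportMsg)
	go func() {
		defer close(out)
		batch := make([]reportMsg, 0, maxSize)
		for msg := range in {
			batch = append(batch, msg)
			if len(batch) == maxSize {
				out <- batch
				// Start a fresh slice so the emitted bundle is not overwritten.
				batch = make([]reportMsg, 0, maxSize)
			}
		}
		if len(batch) > 0 {
			out <- batch
		}
	}()
	return out
}

func main() {
	in := make(chan reportMsg)
	go func() {
		defer close(in)
		for i := 1; i <= 5; i++ {
			in <- reportMsg{
				NodeID:   fmt.Sprintf("node-%d", i),
				ReportID: fmt.Sprintf("r-%d", i),
			}
		}
	}()
	for bundle := range bundleMessages(in, 2) {
		fmt.Printf("bundle of %d message(s)\n", len(bundle))
	}
}
```

Each emitted bundle would then be turned into one Bulk API request instead of one Elasticsearch request per message.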
Definition of Done
Compliance ingestion uses the Elasticsearch Bulk API with bundled messages.