Elasticsearch Version
7.10.2
Installed Plugins
No response
Java Version
openjdk version "15.0.1" 2020-10-20
OS Version
Linux 2f78a71fa18c 5.10.104-linuxkit #1 SMP Thu Mar 17 17:08:06 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Problem Description
When using upserts inside a bulk request with an ingest pipeline, the {{{_id}}} template snippet does not resolve to the provided document id. This diverges from the regular, non-bulk upsert command, where this template snippet does get resolved.
Some context / motivation
According to the Elastic documentation, the recommended way to paginate is search_after, and it is recommended to use a tiebreaker field in sort (see here). A natural candidate meeting the uniqueness requirement is the document _id, but it does not have doc_values enabled. According to the documentation:
_The _id field is restricted from use in aggregations, sorting, and scripting. In case sorting or aggregating on the _id field is required, it is advised to duplicate the content of the _id field into another field that has doc_values enabled._
Therefore, not supporting this use case means that we cannot use search_after if our system requires bulk upserts.
Steps to Reproduce
Setup Index
Run bulk update
Check document
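The request bodies for the three steps above are not included in this copy of the report. The following is a minimal Python sketch of what such a reproduction might look like, not the reporter's actual commands. It only builds the request payloads and does not send them. The index name, pipeline names, the `final` marker field, the `message` field, and the choice of `doc_as_upsert` are all assumptions; only the `docId` field, the `{{{_id}}}` template, document id 1, and the presence of a default and a final pipeline come from the report.

```python
import json

# All names below are hypothetical; the report only names the docId field,
# document id 1, and the existence of a default and a final pipeline.
INDEX = "my-index"
DEFAULT_PIPELINE = "default-pipeline"
FINAL_PIPELINE = "final-pipeline"


def build_setup_requests():
    """Return (method, path, body) tuples that create the pipelines and the index."""
    # Assumed default pipeline: copy the document id into docId.
    copy_id = {"processors": [{"set": {"field": "docId", "value": "{{{_id}}}"}}]}
    # Assumed final pipeline: set a marker field so its execution is visible.
    mark_final = {"processors": [{"set": {"field": "final", "value": True}}]}
    index_body = {
        "settings": {
            "index.default_pipeline": DEFAULT_PIPELINE,
            "index.final_pipeline": FINAL_PIPELINE,
        }
    }
    return [
        ("PUT", f"/_ingest/pipeline/{DEFAULT_PIPELINE}", copy_id),
        ("PUT", f"/_ingest/pipeline/{FINAL_PIPELINE}", mark_final),
        ("PUT", f"/{INDEX}", index_body),
    ]


def build_bulk_upsert(doc_id, doc):
    """Return the NDJSON body for a single-item bulk update with doc_as_upsert."""
    action = {"update": {"_index": INDEX, "_id": doc_id}}
    payload = {"doc": doc, "doc_as_upsert": True}
    # The bulk API expects one JSON object per line and a trailing newline.
    return json.dumps(action) + "\n" + json.dumps(payload) + "\n"


def build_single_upsert(doc_id, doc):
    """Return (method, path, body) for the equivalent non-bulk upsert."""
    return ("POST", f"/{INDEX}/_update/{doc_id}", {"doc": doc, "doc_as_upsert": True})


if __name__ == "__main__":
    for method, path, body in build_setup_requests():
        print(method, path, json.dumps(body))
    print("POST /_bulk")
    print(build_bulk_upsert("1", {"message": "hello"}), end="")
    method, path, body = build_single_upsert("1", {"message": "hello"})
    print(method, path, json.dumps(body))
```

Sending the bulk body with `Content-Type: application/x-ndjson` and then fetching document 1 would be one way to compare the two upsert paths side by side.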
Actual Result
Document 1 got upserted, and both the default and final pipelines got executed. However, the docId was not set.
Expected Result
I'd expect the result of the bulk operation above to be the same as that of the following upsert operation, which does set the docId field:
Logs (if relevant)
No response