elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

Update Existing Document using the op_type in ingest pipeline #114393

Open ikishorkumar opened 3 weeks ago

ikishorkumar commented 3 weeks ago

Elasticsearch Version

8.15.2

Installed Plugins

No response

Java Version

bundled

OS Version

Windows/Linux

Problem Description

I have 10,000+ Filebeat instances deployed across our systems.

  1. All data from those Filebeat instances is ingested into my Elasticsearch cluster.
  2. I use an ingest pipeline to compute a fingerprint and set it as the document _id.
  3. When a document that produces the same _id arrives, it should update the existing document based on that _id, but it does not.

This feature is really needed: many of us want it available in ingest pipelines rather than in Logstash or Filebeat, because with so many agents we cannot ask our clients to reconfigure everything.
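
For context, the closest behaviour available today (as far as I can tell) is a client-side upsert through the bulk API, which requires the client rather than the pipeline to compute the _id; in this sketch the index name is taken from the error below, while the fingerprint value and document fields are made-up examples:

POST _bulk
{ "update": { "_index": "person-data-new-fingureprints", "_id": "LT3+KJ+gZLeq5RP+BlUYF3j4ff4=" } }
{ "doc": { "full_name": "jane doe", "person_id": 42 }, "doc_as_upsert": true }

That works, but it defeats the purpose of computing the fingerprint inside the ingest pipeline, which is exactly why we need op_type (or an upsert option) supported there.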

Steps to Reproduce

PUT _ingest/pipeline/fingureprints-pipeline
{
  "description": "My optional pipeline description",
  "processors": [
    {
      "lowercase": {
        "field": "full_name",
        "target_field": "full_name"
      }
    },
    {
      "fingerprint": {
        "fields": [ "full_name", "person_id" ],
        "target_field": "_id",
        "method": "SHA-1"
      }
    },
    {
      "set": {
        "field": "_op_type",
        "value": "update"
      }
    }
  ]
}
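
One way to reproduce the conflict manually (a sketch assuming a regular index with op_type create, which is how Filebeat writes to data streams; the field values are made-up examples) is to send the same document twice through the pipeline:

POST person-data-new-fingureprints/_doc?op_type=create&pipeline=fingureprints-pipeline
{
  "full_name": "Jane Doe",
  "person_id": 42
}

The first request succeeds; the second fails with the 409 below, because the pipeline produces the same fingerprint _id and op_type create refuses to overwrite an existing document.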

The error I receive:

{
  "error": {
    "root_cause": [
      {
        "type": "version_conflict_engine_exception",
        "reason": "[LT3+KJ+gZLeq5RP+BlUYF3j4ff4=]: version conflict, document already exists (current version [1])",
        "index_uuid": "QMJVFUkoSVCMfAl_UMuYXA",
        "shard": "0",
        "index": "person-data-new-fingureprints"
      }
    ],
    "type": "version_conflict_engine_exception",
    "reason": "[LT3+KJ+gZLeq5RP+BlUYF3j4ff4=]: version conflict, document already exists (current version [1])",
    "index_uuid": "QMJVFUkoSVCMfAl_UMuYXA",
    "shard": "0",
    "index": "person-data-new-fingureprints"
  },
  "status": 409
}

Logs (if relevant)

This happens with whatever data you ingest.

elasticsearchmachine commented 3 weeks ago

Pinging @elastic/es-data-management (Team:Data Management)