Open bircpark opened 1 month ago
This partially depends on #973, but it would also need the ability to update the OpenSearch index.
@bircpark, what source are you using in this case?
@dlvenable, my source is DynamoDB, using the Zero-ETL pipeline integration.
Based on my understanding, the ask here is for Data Prepper to make a call to PUT <index>/_mapping
to update the index's mappings based on the user-defined input. This would allow an existing index to be modified as new fields are added.
Yes, that is correct.
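For illustration, the PUT <index>/_mapping call discussed above could look roughly like the sketch below. It only builds the HTTP request and does not send it. The endpoint URL, the index name my-index, the field name title, and the helper build_put_mapping_request are all hypothetical, and authentication (for example SigV4 for a managed domain) is not shown. This is not how Data Prepper implements anything today.

```python
import json
import urllib.request


def build_put_mapping_request(base_url, index, properties):
    """Build (but do not send) a PUT <index>/_mapping request.

    Hypothetical helper for illustration only.
    """
    url = f"{base_url.rstrip('/')}/{index}/_mapping"
    body = json.dumps({"properties": properties}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Assumed mapping change: add an "english" subfield to an existing text field.
# The field name "title" is made up; the existing field type is repeated as-is.
new_properties = {
    "title": {
        "type": "text",
        "fields": {
            "english": {"type": "text", "analyzer": "english"},
        },
    }
}

request = build_put_mapping_request(
    "https://localhost:9200", "my-index", new_properties
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To actually send it: urllib.request.urlopen(request)
```

Adding a subfield this way leaves the existing field's own type unchanged, which is why it can be applied to a live index; documents indexed before the change still need to be re-indexed before the new subfield is searchable.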
Is your feature request related to a problem? Please describe. Currently, an OSIS pipeline seems to require either manual intervention or downtime when updating the mappings for an index. This includes adding subfields to an existing field or adding an entirely new field.
Existing Configuration For Mapping
New Configuration For Mapping
The english subfield does not appear in the index mapping on the cluster, and using it requires downtime or manual changes.
Describe the solution you'd like It would be nice for OSIS pipelines to be able to update index mappings when the mappings change in the pipeline configuration. Once the mapping has been updated, the pipeline could issue an update_by_query call, or something similar, to populate the new fields.
Describe alternatives you've considered (Optional) a) Manually change the mapping, then invoke the update_by_query API to backfill records. b) Take some downtime to stop the pipeline, delete the index, then restart the pipeline to re-sync the data.
Additional context The suggested solution is mainly concerned with updating subfields, because update_by_query will only populate subfields of already existing fields and won't work for brand-new fields introduced to the mapping. For entirely new fields, you would need to run something else (maybe a Glue job) to update the documents reliably.
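As a rough sketch of the backfill step, an update_by_query request with no script re-indexes each matching document from its stored _source, which is why it can populate a new subfield of a field that already has values but cannot produce values for a field that was never in the documents. The example below only builds the request and does not send it. The endpoint URL, the index name my-index, and the helper build_update_by_query_request are hypothetical, and authentication is not shown.

```python
import json
import urllib.parse
import urllib.request


def build_update_by_query_request(base_url, index, query=None):
    """Build (but do not send) a POST <index>/_update_by_query request.

    Hypothetical helper for illustration only. With no script in the body,
    each matching document is re-indexed from its existing _source.
    """
    params = urllib.parse.urlencode(
        {
            # Skip version conflicts caused by concurrent pipeline writes.
            "conflicts": "proceed",
            # Run as a background task instead of blocking the caller.
            "wait_for_completion": "false",
        }
    )
    url = f"{base_url.rstrip('/')}/{index}/_update_by_query?{params}"
    body = json.dumps({"query": query or {"match_all": {}}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


backfill = build_update_by_query_request("https://localhost:9200", "my-index")
print(backfill.get_method(), backfill.full_url)
print(backfill.data.decode("utf-8"))
# To actually send it: urllib.request.urlopen(backfill)
```

Passing a narrower query (for example, only documents written before the mapping change) would reduce the amount of re-indexing, but how to identify those documents depends on the data and is not covered here.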