Open SavvasSriAnushaVeeramachineni opened 2 months ago
@SavvasSriAnushaVeeramachineni, thank you for opening this issue. I understand that you'd like Data Prepper to automatically call the _refresh API for every updated index.
Can you clarify what will try making that call? Are you using S3-scan? Do you want the completion of the scan to trigger the refresh?
As a result of this behavior - there is a delay in the data being available even though the ingestion to OpenSearch is complete.
What is your delay?
Also, have you tried using the default refresh_interval to let OpenSearch handle it?
@dlvenable Thanks for replying! Regarding: Can you clarify what will try making that call? Are you using S3-scan? Do you want the completion of the scan to trigger the refresh?
What is your delay?
Also, have you tried using the default refresh_interval to let OpenSearch handle it?
@dlvenable Do you have any suggestion or solution for the requirement we described?
Is there any plan to pick up this enhancement request in the near future?
Hi @dlvenable: Good day, is there any traction on the above use case?
Is your feature request related to a problem? Please describe. Currently our ETL job runs every 30 minutes and inserts a file into S3, triggering the OpenSearch ingestion pipeline. Because the ETL completion time varies, it is challenging to determine a suitable refresh_interval at the index level that works consistently for all scenarios. As a result, there is a delay in the data being available even though the ingestion to OpenSearch is complete.
Describe the solution you'd like We propose adding a new configuration option for HTTP post-processor hooks in the Data Prepper pipeline definition. It would let us specify an HTTP POST endpoint and make a refresh API call (/index-name/_refresh) after pipeline ingestion is completed.
Currently, the processor available in the pipeline definition only runs before data is ingested into OpenSearch.
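For reference, the call such a hook would need to make is small. The following is only a minimal Python sketch of a manual refresh, not an existing Data Prepper feature: the helper names `build_refresh_request` and `refresh_index`, the base URL, and the index name are all hypothetical, and authentication and TLS settings are omitted.

```python
import json
import urllib.request
from urllib.parse import quote


def build_refresh_request(base_url: str, index_name: str) -> urllib.request.Request:
    """Build a POST request for /<index-name>/_refresh (hypothetical helper)."""
    # Percent-encode the index name so it is safe to place in the URL path.
    url = f"{base_url.rstrip('/')}/{quote(index_name, safe='')}/_refresh"
    return urllib.request.Request(url, method="POST")


def refresh_index(base_url: str, index_name: str, timeout: float = 10.0) -> dict:
    """Send the refresh request and return the parsed JSON response.

    Assumes the cluster is reachable without extra authentication headers.
    """
    request = build_refresh_request(base_url, index_name)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Placeholder values; replace with a real endpoint and index to try it.
    req = build_refresh_request("https://localhost:9200", "my-index")
    print(req.get_method(), req.full_url)
```

Today this would have to run outside the pipeline, for example as a final step of the ETL job, which is the gap the proposed hook is meant to close.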
Describe alternatives you've considered (Optional) Provide a refresh option in the pipeline's index settings that internally refreshes the index after the pipeline has executed.
Additional context N/A