opensearch-project / opensearch-hadoop

Apache License 2.0

[BUG] default behavior is to refresh after batches. This should be removed #324

Open wbeckler opened 1 year ago

wbeckler commented 1 year ago

What is the bug?

When OPENSEARCH_BATCH_WRITE_REFRESH_DEFAULT = "true", the client triggers a global refresh command after batch writes. This forces a refresh of all indexes, even read-only ones, which results in an error. A refresh should be issued per index, against only the index that was written to. The default is dangerous; it was initially added to make the client easier to debug. It would be a better experience if the default were set to "false", which would remove a latent bug that eventually causes failures.

https://github.com/opensearch-project/opensearch-hadoop/blob/203d53f3f5c052c27691a079f684433f51035f02/mr/src/main/java/org/opensearch/hadoop/cfg/ConfigurationOptions.java#L97

mukhtarGeo commented 1 year ago

Hi, we hit this issue when using the default settings of the hadoop client, which set refresh to "true". Please change the default to false so that write operations into an index do not fail.
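
Until the default changes, a job can presumably opt out by setting the refresh option to "false" in its own configuration. The following is a minimal sketch of that idea using a plain map of job settings rather than the real Hadoop or Spark configuration classes. The property key `opensearch.batch.write.refresh` is an assumption (the issue only names the constant that holds its default value), and `opensearch.resource`, `my-index`, and the `RefreshOptOut` helper are hypothetical names used for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RefreshOptOut {
    // ASSUMPTION: the key the connector reads for this option. The issue only
    // names OPENSEARCH_BATCH_WRITE_REFRESH_DEFAULT, the constant for its default.
    static final String BATCH_WRITE_REFRESH = "opensearch.batch.write.refresh";

    /** Returns a copy of the job settings with the post-write refresh disabled. */
    static Map<String, String> withoutBatchRefresh(Map<String, String> jobSettings) {
        Map<String, String> copy = new LinkedHashMap<>(jobSettings);
        copy.put(BATCH_WRITE_REFRESH, "false");
        return copy;
    }

    public static void main(String[] args) {
        Map<String, String> settings = new LinkedHashMap<>();
        // Hypothetical target index; the key name is also an assumption.
        settings.put("opensearch.resource", "my-index");

        // Copy every entry into the job's real configuration object
        // (for example a Hadoop Configuration or a SparkConf) before submitting.
        withoutBatchRefresh(settings)
            .forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

With the refresh disabled, newly written documents become searchable at the index's own refresh interval, so a job that must read its writes immediately would need to refresh that one index explicitly.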