Open damienalexandre opened 3 years ago
Pinging @elastic/es-distributed (Team:Distributed)
Any update on this?
Very annoying bug. I spent a couple of hours thinking that the script simply didn't work.
Why can't we have the same index name as the source doc? At the very least it should throw an error instead of falling back to the default index name provided in dest.index.
Elasticsearch version (bin/elasticsearch --version): 7.8.1 (but affects all versions)
Plugins installed: []
JVM version (java -version): bundled
OS version (uname -a if on a Unix-like system): 20.04.1-Ubuntu
Description of the problem including expected versus actual behavior:
When using the Reindex API to move multiple indices from a remote to a local cluster, we can use a wildcard in the source index name parameter.
In the destination we cannot, but that's OK because a script can be used.
But if your script sets exactly the same index name as the remote index, it is completely ignored and all your documents are sent to one and only one index, without any warning or error.
I had a hard time figuring out why my script wasn't working and narrowed it down to this:
https://github.com/elastic/elasticsearch/blob/2dbd59bbe167b1942c9725693cc1e600856d3554/modules/reindex/src/main/java/org/elasticsearch/index/reindex/AbstractAsyncBulkByScrollAction.java#L762-L764
If the index name set by a script is the same as the index name from the document, nothing is updated. That's probably good when the source is not a wildcard, but when it is, that's problematic!
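To make the consequence concrete, here is a minimal Python sketch of the behavior described above. It is a toy model, not the Elasticsearch code: the function name `resolve_destination` and its parameters are made up for illustration, and the index names are assumed from the documentation example.

```python
def resolve_destination(source_index, dest_index, script_index):
    """Toy model of the check described in this issue (not Elasticsearch code).

    The script sees ctx._index == source_index. A change is only detected
    when the value left by the script differs from that original value.
    """
    if script_index != source_index:
        # Change detected: the script's target index is used.
        return script_index
    # No change detected: the request keeps the index from dest.index.
    return dest_index


# Hypothetical index names, assumed for illustration only.
sources = ["metricbeat-2016.05.30", "metricbeat-2016.05.31"]

# Script sets ctx._index to the same name as the source index.
same_name = {resolve_destination(s, "metricbeat", s) for s in sources}

# Script sets ctx._index to a different name (here with a "-1" suffix).
renamed = {resolve_destination(s, "metricbeat", s + "-1") for s in sources}
```

In this model `same_name` collapses to the single fallback index, while `renamed` keeps one destination per source index, which matches the symptom reported here.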
Steps to reproduce:
Example adapted from the documentation:
We expect only 2 indices with a document each (reindexed "in place").
We get 3 indices: a new metricbeat index is created unexpectedly.
No warnings or errors are triggered.
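The exact requests used for the reproduction are not preserved in this text. As a hedged reconstruction, the Python sketch below builds a `_reindex` request body of the kind the steps describe: a wildcard source, `metricbeat` as `dest.index`, and a Painless script that rebuilds the same index name the document already had. The source pattern and the script text are assumptions modelled on the documentation example, and `build_reindex_body` is a hypothetical helper.

```python
import json


def build_reindex_body(source_pattern, fallback_index, script_source):
    """Build a request body for POST _reindex (hypothetical helper)."""
    return {
        "source": {"index": source_pattern},
        "dest": {"index": fallback_index},
        "script": {"lang": "painless", "source": script_source},
    }


# Assumed script: it recomputes the index name from the source name and ends
# up with the identical value, so no change is detected.
same_name_script = (
    "ctx._index = 'metricbeat-' + "
    "ctx._index.substring('metricbeat-'.length(), ctx._index.length())"
)

body = build_reindex_body("metricbeat-*", "metricbeat", same_name_script)
payload = json.dumps(body, indent=2)
print(payload)
```

Sending `payload` to `POST _reindex` on a cluster holding two `metricbeat-*` indices should, per this report, put every document into `metricbeat` rather than back into the original indices.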
Ref #18654 #19662