If the inference processor isn't configured correctly (which means adding specific configuration that isn't there by default), any errors that occur during inference are not recorded anywhere.
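One way to make inference errors visible is an `on_failure` block on the ingest pipeline. The sketch below only builds the pipeline body as a Python dict. It is a minimal illustration, not necessarily the configuration meant here: the model id, field names, and the `failed-` index prefix are assumptions.

```python
import json


def build_inference_pipeline(model_id, input_field, output_field):
    """Build an ingest pipeline body that records inference failures.

    All argument values are supplied by the caller; the ones used in the
    example call below are hypothetical placeholders.
    """
    return {
        "processors": [
            {
                "inference": {
                    "model_id": model_id,
                    "input_output": [
                        {"input_field": input_field, "output_field": output_field}
                    ],
                }
            }
        ],
        # Without an on_failure handler, a failed document leaves no trace
        # of its error. Here failed documents are rerouted to a separate
        # index (assumed naming: "failed-<original index>") and the error
        # message is stored on the document itself.
        "on_failure": [
            {"set": {"field": "_index", "value": "failed-{{{_index}}}"}},
            {
                "set": {
                    "field": "ingest.failure",
                    "value": "{{_ingest.on_failure_message}}",
                }
            },
        ],
    }


# Hypothetical model id and field names, for illustration only.
pipeline = build_inference_pipeline(
    "my-embedding-model", "content", "content_embedding"
)
print(json.dumps(pipeline, indent=2))
```

The resulting body would be sent with `PUT _ingest/pipeline/<pipeline name>`. Documents that fail inference then end up in the `failed-` index with the error message attached, where they can be inspected and retried.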
Background
Reindex lets users create a new index from data that is already in Elasticsearch. This is especially useful when moving to semantic search, because users have often already implemented text search and want to embed their existing data into a new index. Unfortunately, reindex has flaws that make it difficult or impossible to use on larger datasets and when machine learning models are used to produce embeddings.
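For context, a reindex that embeds existing data might look like the sketch below. It builds only the request body and query parameters for `POST _reindex`. The index names, pipeline name, and batch size are assumptions chosen for illustration.

```python
def build_reindex_request(source_index, dest_index, pipeline, batch_size=50):
    """Build the body and query parameters for a POST _reindex call.

    Returns a (body, params) tuple. Nothing is sent to a cluster here.
    """
    body = {
        # "size" under "source" is the scroll batch size. A small value is
        # assumed here because each document has to pass through the model.
        "source": {"index": source_index, "size": batch_size},
        # The destination pipeline is where the inference processor runs.
        "dest": {"index": dest_index, "pipeline": pipeline},
    }
    # Run as a background task instead of holding the HTTP request open.
    params = {"wait_for_completion": "false"}
    return body, params


# Hypothetical index and pipeline names.
body, params = build_reindex_request(
    "articles", "articles-semantic", "embed-articles"
)
print(body)
print(params)
```

With `wait_for_completion=false` the call returns a task id, and progress has to be polled through the tasks API.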
Problems
Resiliency - Issues with failures and errors
Issues with size
Issues with performance
Issues with scroll
Possible solutions in the works?
https://github.com/elastic/elasticsearch/issues/27724#issuecomment-2101539332