Closed benwtrent closed 1 month ago
I am having a difficult time figuring out how to fix this. It seems to me that if the segment is "hard deleted", we should reset all its FieldInfos as there isn't any data written in it at all.
But I am not sure the individual processDoc action can do this, as it only knows about the documents it added.
What makes matters worse is that it doesn't even have to be ALL docs that failed, just some of them that had point values (or knn vector values, etc.). Anything that eagerly updates FieldInfos but doesn't actually get flushed could trigger this weird behavior when opening the NRT reader.
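To make the suspected mismatch concrete, here is a toy model in plain Java. It is only a sketch under an assumption: that field metadata is recorded for every field of a document before any field is actually indexed. None of the names below (`EagerFieldMetadataSketch`, `ToyField`, `declaresPoints`, `hasPointData`) are Lucene APIs; they are hypothetical stand-ins for FieldInfos and the buffered point data.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model only; none of these names are Lucene APIs. */
public class EagerFieldMetadataSketch {

  /** Hypothetical field: a name, whether it carries a point value, whether indexing it throws. */
  record ToyField(String name, boolean hasPoint, boolean crashes) {}

  // Segment-level metadata, updated eagerly (stands in for FieldInfos).
  private final Set<String> fieldsDeclaringPoints = new HashSet<>();
  // Data actually buffered for flush (stands in for the written point values).
  private final Map<String, List<Integer>> bufferedPoints = new HashMap<>();

  void addDocument(List<ToyField> doc) {
    // Pass 1 (assumed): record metadata for every field before any data is written.
    for (ToyField f : doc) {
      if (f.hasPoint()) {
        fieldsDeclaringPoints.add(f.name());
      }
    }
    // Pass 2: index field by field; a failure here leaves the pass-1 metadata in place.
    for (ToyField f : doc) {
      if (f.crashes()) {
        throw new IllegalStateException("simulated indexing failure on " + f.name());
      }
      if (f.hasPoint()) {
        bufferedPoints.computeIfAbsent(f.name(), k -> new ArrayList<>()).add(17);
      }
    }
  }

  boolean declaresPoints(String field) {
    return fieldsDeclaringPoints.contains(field);
  }

  boolean hasPointData(String field) {
    return bufferedPoints.containsKey(field);
  }

  public static void main(String[] args) {
    EagerFieldMetadataSketch segment = new EagerFieldMetadataSketch();
    try {
      // The crashing field comes first, the point field after it.
      segment.addDocument(
          List.of(new ToyField("crash", false, true), new ToyField("int", true, false)));
    } catch (IllegalStateException expected) {
      // The document failed, but the metadata recorded in pass 1 was not rolled back.
    }
    System.out.println("declares points: " + segment.declaresPoints("int"));
    System.out.println("has point data: " + segment.hasPointData("int"));
  }
}
```

In this model the segment ends up claiming that `int` has points while holding no point data for it, which is the kind of inconsistency a reader could trip over. Whether the real fix belongs in per-document rollback or in resetting metadata for a fully deleted segment is exactly the open question above.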
Description
There has been a nasty test failure in ES for a while: https://github.com/elastic/elasticsearch/issues/105122
The test simulates a document indexing failure. It turns out that this test failure is caused by a series of strange conditions in Lucene. If indexing fails on one field, and the document has a point-values field that comes AFTER the failing field, things will blow up when opening a reader if the writer has soft deletes enabled.
The failure description is as follows:
Test that replicates the failure
```java
public void testExceptionJustBeforeFlushWithPointValues() throws Exception {
  Directory dir = newDirectory();
  Analyzer analyzer =
      new Analyzer(Analyzer.PER_FIELD_REUSE_STRATEGY) {
        @Override
        public TokenStreamComponents createComponents(String fieldName) {
          MockTokenizer tokenizer = new MockTokenizer(MockTokenizer.WHITESPACE, false);
          // disable workflow checking as we forcefully close() in exceptional cases.
          tokenizer.setEnableChecks(false);
          TokenStream stream = new CrashingFilter(fieldName, tokenizer);
          return new TokenStreamComponents(tokenizer, stream);
        }
      };
  DirectoryReader r = null;
  IndexWriterConfig iwc =
      newIndexWriterConfig(analyzer).setCommitOnClose(false).setMaxBufferedDocs(3);
  MergePolicy mp = iwc.getMergePolicy();
  iwc.setMergePolicy(
      new SoftDeletesRetentionMergePolicy("soft_delete", MatchAllDocsQuery::new, mp));
  IndexWriter w = RandomIndexWriter.mockIndexWriter(dir, iwc, random());
  Document newdoc = new Document();
  newdoc.add(newTextField("crash", "do it on token 4", Field.Store.NO));
  newdoc.add(new IntPoint("int", 17));
  expectThrows(IOException.class, () -> w.addDocument(newdoc));
  try {
    r = w.getReader(false, false);
  } catch (AlreadyClosedException ace) {
    // expected
  }
  dir.close();
}
```
The exception thrown is:
Version and environment details
No response