mitjale / lucenetransform

Automatically exported from code.google.com/p/lucenetransform

IOException thrown from TransformedIndexInput while reading compressed file #5

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Our application is using Lucene with LuceneTransform to compress, but not 
encrypt, the stored files.
2. On seemingly random files, a query that needs to read the file fails with 
the IOException shown below.

What is the expected output? What do you see instead?

java.io.IOException: Invalid compression chunk location 131072!=4
        at org.apache.lucene.store.transform.TransformedIndexInput.readDecompressImp(TransformedIndexInput.java:471)
        at org.apache.lucene.store.transform.TransformedIndexInput.readDecompress(TransformedIndexInput.java:430)
        at org.apache.lucene.store.transform.TransformedIndexInput.readByte(TransformedIndexInput.java:537)
        at org.apache.lucene.store.DataInput.readVInt(DataInput.java:105)
        at org.apache.lucene.index.TermBuffer.read(TermBuffer.java:64)
        at org.apache.lucene.index.SegmentTermEnum.next(SegmentTermEnum.java:133)
        at org.apache.lucene.index.SegmentTermEnum.scanTo(SegmentTermEnum.java:174)
        at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:236)
        at org.apache.lucene.index.TermInfosReader.terms(TermInfosReader.java:304)
        at org.apache.lucene.index.SegmentReader.terms(SegmentReader.java:464)
        at org.apache.lucene.search.NumericRangeQuery$NumericRangeTermEnum.next(NumericRangeQuery.java:565)
        at org.apache.lucene.search.NumericRangeQuery$NumericRangeTermEnum.<init>(NumericRangeQuery.java:507)
        at org.apache.lucene.search.NumericRangeQuery.getEnum(NumericRangeQuery.java:313)
        at org.apache.lucene.search.MultiTermQueryWrapperFilter.getDocIdSet(MultiTermQueryWrapperFilter.java:107)
        at org.apache.lucene.search.ConstantScoreQuery$ConstantWeight.scorer(ConstantScoreQuery.java:139)
        at org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:298)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:577)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:383)
        <stack trace truncated>

What version of the product are you using? On what operating system?

We are using LuceneTransform-0.9.2.2.jar on a 64-bit Linux platform.

Please provide any additional information below.

We've looked at the source code change list and don't think that change r57 
will help us, even though issue 3 looks similar to ours (based on the exception 
being thrown). Change r57 appears to apply only to encrypted files; ours are 
compressed, but not encrypted.

We are wondering if our issue might have been fixed by r58?

Original issue reported on code.google.com by dodn...@gmail.com on 16 Nov 2012 at 1:53

GoogleCodeExporter commented 9 years ago
The check-in notice on r58 reads: "Fixed to short buffer problem when 
encrypting short data". Could that problem lead to the IOException we are 
seeing when the file is decompressed?

Original comment by dodn...@gmail.com on 16 Nov 2012 at 9:35

GoogleCodeExporter commented 9 years ago
We have found the problem and have a proposed fix. The method 
TransformedIndexInput.java needs to be repaired so that the if-statement at 
line 379 (from source revision 0.9.2.2):

// for performance reason check, next chunk if it is on correct location
if (chunkPos + 1 < inflatedPositions.length) {
    if (inflatedPositions[chunkPos + 1] == bufferPos) {
        chunkPos++;
        if (input.getFilePointer() != chunkPositions[chunkPos]) {
            input.seek(chunkPositions[chunkPos]);
        }
        return 0;
    } else {
        // EOF
        throw new EOFException();
    }

Original comment by dodn...@gmail.com on 21 Nov 2012 at 11:08

GoogleCodeExporter commented 9 years ago
Correction to my previous comment. It's the seekToChunk() method in class 
TransformedIndexInput that needs to be repaired.

Original comment by dodn...@gmail.com on 21 Nov 2012 at 11:33
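
For readers following the seekToChunk() discussion above, here is a minimal, standalone sketch of the general idea behind the quoted snippet: check whether the next chunk starts exactly at the requested position, and otherwise fall back to a lookup. It is not the LuceneTransform code and not the proposed fix. The class name ChunkIndex, the binary-search fallback, and the assumption that chunk start offsets are strictly ascending and begin at 0 are all hypothetical and chosen only for illustration.

```java
import java.util.Arrays;

/** Hypothetical illustration only; not the LuceneTransform implementation. */
final class ChunkIndex {
    // Start offset of each chunk in the uncompressed stream.
    // Assumed (for this sketch) to be strictly ascending and to start at 0.
    private final long[] inflatedPositions;
    // Index of the chunk the reader is currently positioned on.
    private int chunkPos;

    ChunkIndex(long[] inflatedPositions) {
        if (inflatedPositions.length == 0 || inflatedPositions[0] != 0L) {
            throw new IllegalArgumentException("first chunk must start at offset 0");
        }
        this.inflatedPositions = inflatedPositions.clone();
    }

    /** Returns the index of the chunk that contains bufferPos and remembers it. */
    int seekToChunk(long bufferPos) {
        if (bufferPos < 0) {
            throw new IllegalArgumentException("negative position: " + bufferPos);
        }
        // Fast path: a sequential read that lands exactly on the next chunk's start.
        if (chunkPos + 1 < inflatedPositions.length
                && inflatedPositions[chunkPos + 1] == bufferPos) {
            return ++chunkPos;
        }
        // Slow path: find the last chunk whose start offset is <= bufferPos.
        int idx = Arrays.binarySearch(inflatedPositions, bufferPos);
        if (idx < 0) {
            // Not an exact chunk start: binarySearch returns -(insertionPoint) - 1,
            // so the containing chunk is insertionPoint - 1.
            idx = -idx - 2;
        }
        chunkPos = idx;
        return chunkPos;
    }
}
```

In this sketch a position that does not match the next chunk's start is treated as an ordinary random access rather than as end of file; whether that matches the behavior intended in TransformedIndexInput is not established by this thread.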