mitjale / lucenetransform

Automatically exported from code.google.com/p/lucenetransform

NPE on org.apache.lucene.store.BufferedIndexInput.bufferSize #6

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Hello Everybody,

I am trying to create an encrypted Lucene index in a small Java program running on 
MacOS 10.9.5 with Lucene Transform version 4.0.0 beta and Lucene core version 
4.0.0. As test data I use 15000 text files. The biggest one is 385 KB, the 
smallest 20 KB. 

When I run the application, an NPE is thrown:

java.lang.NullPointerException
    at org.apache.lucene.store.BufferedIndexInput.bufferSize(BufferedIndexInput.java:345)
    at org.apache.lucene.store.BufferedIndexInput.<init>(BufferedIndexInput.java:60)
    at org.apache.lucene.store.FSDirectory$FSIndexInput.<init>(FSDirectory.java:454)
    at org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.<init>(NIOFSDirectory.java:122)
    at org.apache.lucene.store.NIOFSDirectory.openInput(NIOFSDirectory.java:80)
    at org.apache.lucene.store.transform.TransformedDirectory.fileLength(TransformedDirectory.java:184)
    at org.apache.lucene.index.SegmentInfo.sizeInBytes(SegmentInfo.java:116)
    at org.apache.lucene.index.IndexWriter.prepareFlushedSegment(IndexWriter.java:2141)
    at org.apache.lucene.index.DocumentsWriter.publishFlushedSegment(DocumentsWriter.java:509)
    at org.apache.lucene.index.DocumentsWriter.finishFlush(DocumentsWriter.java:487)
    at org.apache.lucene.index.DocumentsWriterFlushQueue$SegmentFlushTicket.publish(DocumentsWriterFlushQueue.java:204)
    at org.apache.lucene.index.DocumentsWriterFlushQueue.innerPurge(DocumentsWriterFlushQueue.java:118)
    at org.apache.lucene.index.DocumentsWriterFlushQueue.forcePurge(DocumentsWriterFlushQueue.java:137)
    at org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:445)
    at org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:565)
    at org.apache.lucene.index.IndexWriter.prepareCommit(IndexWriter.java:2739)
    at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2875)
    at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2855)
    at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2839)
    at com.wps.xnotarneu.inMemory.InMemoryImprovedExample.initializeWithTestData(InMemoryImprovedExample.java:273)
    at com.wps.xnotarneu.inMemory.InMemoryImprovedExample.main(InMemoryImprovedExample.java:63)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)

Below are my methods:

    private static void initializeWithTestData(Directory idx) throws IOException {
        Analyzer a = new StandardAnalyzer(Version.LUCENE_40);

        IndexWriterConfig idxConfig = new IndexWriterConfig(Version.LUCENE_40, a);
        idxConfig.setOpenMode(OpenMode.CREATE_OR_APPEND);

        // Create a writer to build the index
        IndexWriter writer =  new IndexWriter(idx, idxConfig);

        File testDataDirectory = new File(TEST_DATA_DIR_PATH);

        Document newDoc = new Document();
        newDoc.add(new TextField("title", "", Field.Store.YES));
        newDoc.add(new TextField("name", "", Field.Store.YES));
        newDoc.add(new TextField("content", "", Field.Store.YES));

        if (testDataDirectory!=null && testDataDirectory.exists()) {
            Collection<File> listFiles = FileUtils.listFiles(testDataDirectory,
                    FileFilterUtils.fileFileFilter(),
                    null);
            int countOfIndexed = 0;

            for (File file : listFiles) {
                if (file.isFile()) {
                    List<String> lines = FileUtils.readLines(file);
                    String title = "";
                    String name = file.getName();
                    StringBuilder body = new StringBuilder();
                    for (String line: lines) {
                        String trimmedLine = line.trim();
                        if (trimmedLine.length()>0) {
                            if (title.isEmpty()) {
                                title = trimmedLine;
                            } else {
                                body.append(trimmedLine);
                                body.append(" ");
                            }
                        }
                    }
                    try {

                        writer.addDocument(createDocument(newDoc, title, name, body.toString()));

                    } catch(Throwable t) {
                        t.printStackTrace();
                        System.out.println("Error to index the document: " + name + " at position: " + countOfIndexed);
                    }
                    countOfIndexed++;
                }
            }
            System.out.println("Indexed documents: " + countOfIndexed);
        } else {
            // Add some Document objects containing quotes
            writer.addDocument(createDocument(newDoc, "Theodore Roosevelt", "in-memory-" + System.currentTimeMillis(),
                    "It behooves every man to remember that the work of the " +
                            "critic, is of altogether secondary importance, and that, " +
                            "in the end, progress is accomplished by the man who does " +
                            "things."));
            writer.addDocument(createDocument(newDoc, "Friedrich Hayek","in-memory-" + System.currentTimeMillis(),
                    "The case for individual freedom rests largely on the " +
                            "recognition of the inevitable and universal ignorance " +
                            "of all of us concerning a great many of the factors on " +
                            "which the achievements of our ends and welfare depend."));
            writer.addDocument(createDocument(newDoc, "Ayn Rand","in-memory-" + System.currentTimeMillis(),
                    "There is nothing to take a man's freedom away from " +
                            "him, save other men. To be free, a man must be free " +
                            "of his brothers."));
            writer.addDocument(createDocument(newDoc, "Mohandas Gandhi","in-memory-" + System.currentTimeMillis(),
                    "Freedom is not worth having if it does not connote " +
                            "freedom to err."));
            writer.commit();
        }

        // Close the writer to finish building the index (close() also commits pending changes)
        writer.close();
    }

    private static Directory getEncryptedIndexDirectory(String password, byte[] salt)
            throws IOException, GeneralSecurityException {
        File encryptedDir = new File(getDirPath(true));
        Directory bdir = FSDirectory.open(encryptedDir);
        DataEncryptor enc = new DataEncryptor("AES/ECB/PKCS5Padding", password, salt, 128, true);
        DataDecryptor dec = new DataDecryptor(password, salt, true);

        //Directory cdir = new TransformedDirectory(bdir, st, rt);
        Directory cdir = new TransformedDirectory(bdir, enc, dec);

        return cdir;
    }

    private static Document createDocument(Document doc, String title, String name, String content) {

        TextField f = (TextField)doc.getField("title");
        f.setStringValue(title);

        f = (TextField)doc.getField("name");
        f.setStringValue(name);

        TextField f1 = (TextField)doc.getField("content");
        f1.setStringValue(content);

        return doc;
    }

Do you have any idea what might be causing the exception?

Thank you.

Best regards,

Borislav

Original issue reported on code.google.com by b.roussa...@yahoo.com on 2 Oct 2014 at 1:44

GoogleCodeExporter commented 9 years ago
I encountered the same problem with Lucene 4.9 and Lucene Transform 4.9.

I found the following ways to avoid/fix the problem:
    1.  Always use MMapDirectory; the exception does not occur with it
    2.  Modify calls to nested.openInput() in TransformedDirectory.fileLength from: 

    nested.openInput(name, null);

    To:
    nested.openInput(name, IOContext.READONCE);

I'm not sure whether (2) is the correct fix, as I'm not very familiar with 
Lucene internals; there may be side effects I've yet to find.

Original comment by cmatthew...@gmail.com on 23 Oct 2014 at 7:22
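
For readers following the thread, here is a minimal sketch of why passing null as the context could produce the NPE in the trace above, and why passing a real context such as READONCE avoids it. It assumes that the buffer-size lookup dereferences the context without a null check, which is consistent with the stack trace but is not copied from Lucene. The names NullContextSketch, Usage and Ctx, and the buffer sizes, are hypothetical stand-ins, not Lucene's classes or values.

```java
public class NullContextSketch {
    // Hypothetical stand-in for the kinds of usage a context can describe.
    enum Usage { MERGE, READ, FLUSH, DEFAULT }

    // Hypothetical stand-in for a context object such as IOContext.
    static final class Ctx {
        static final Ctx READONCE = new Ctx(Usage.READ);
        final Usage usage;
        Ctx(Usage usage) { this.usage = usage; }
    }

    // Illustrative values only; assumed for this sketch.
    static final int BUFFER_SIZE = 1024;
    static final int MERGE_BUFFER_SIZE = 4096;

    // Picks a buffer size from the context. There is no null check,
    // so ctx.usage throws NullPointerException when ctx is null.
    static int bufferSize(Ctx ctx) {
        switch (ctx.usage) {
            case MERGE:
                return MERGE_BUFFER_SIZE;
            default:
                return BUFFER_SIZE;
        }
    }

    // Returns true if bufferSize(ctx) fails with a NullPointerException.
    static boolean throwsNpe(Ctx ctx) {
        try {
            bufferSize(ctx);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("null context throws NPE: " + throwsNpe(null));
        System.out.println("READONCE buffer size: " + bufferSize(Ctx.READONCE));
    }
}
```

Under that assumption, any Directory implementation that builds a buffered input from the context would fail the same way when handed null. That would also fit workaround (1), if MMapDirectory does not go through that code path; this is an inference from the comments above, not something verified here.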