avaneesh23 / galagosearch

Automatically exported from code.google.com/p/galagosearch
BSD 3-Clause "New" or "Revised" License

Unrecognised File Extensions Crash Indexer #15

GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
If a collection directory contains a document with an unrecognised file
extension, Galago prints a message about skipping it and then crashes with a
NullPointerException.

Running the command:
> bin/galago build /tmp/ap1.index collections/adhoc_colls/ap1/

2009-04-23 17:18:12.827::INFO:  Logging to STDERR via org.mortbay.log.StdErrLog
2009-04-23 17:18:12.828::INFO:  jetty-6.1.5
2009-04-23 17:18:12.843::INFO:  Started SocketConnector@0.0.0.0:40875
Status: http://localhost:40875
Skipping: collections/adhoc_colls/ap1/file_list
Exception in thread "main" java.util.concurrent.ExecutionException: Stage threw an exception:
    at org.galagosearch.tupleflow.execution.JobExecutor$JobExecutionStatus.waitForStages(JobExecutor.java:1135)
    at org.galagosearch.tupleflow.execution.JobExecutor$JobExecutionStatus.run(JobExecutor.java:1054)
    at org.galagosearch.tupleflow.execution.JobExecutor.runWithServer(JobExecutor.java:1191)
    at org.galagosearch.tupleflow.execution.JobExecutor.runLocally(JobExecutor.java:1215)
    at org.galagosearch.core.tools.App.handleBuild(App.java:121)
    at org.galagosearch.core.tools.App.main(App.java:422)
Caused by: java.lang.NullPointerException
    at org.galagosearch.core.parse.DocumentSource.processFile(DocumentSource.java:124)
    at org.galagosearch.core.parse.DocumentSource.processDirectory(DocumentSource.java:139)
    at org.galagosearch.core.parse.DocumentSource.run(DocumentSource.java:150)
    at org.galagosearch.tupleflow.execution.ThreadedStageExecutor$InstanceRunnable.run(ThreadedStageExecutor.java:57)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:636)

This occurs in the binary release of Galago 1.01 on Ubuntu 8.10.

The bug appears to be in the processFile method of DocumentSource, which
prints a message about skipping the file but then attempts to index it
anyway.
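
To illustrate the suspected pattern, here is a minimal standalone sketch. It is
not Galago's actual code: the class SkipUnknownExtensions, the methods formatFor
and selectFiles, and the extension table are all hypothetical names made up for
this example. It only shows the general idea that, once a file has been
reported as skipped, processing of that file has to stop before the null
format is used.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch; none of these names come from the Galago source.
final class SkipUnknownExtensions {
    // Illustrative extension-to-format table (an assumption, not Galago's list).
    private static final Map<String, String> FORMATS = new HashMap<>();
    static {
        FORMATS.put("txt", "text");
        FORMATS.put("xml", "xml");
    }

    /** Returns the format for a file name, or null if the extension is unrecognised. */
    public static String formatFor(String path) {
        String name = new File(path).getName();
        int dot = name.lastIndexOf('.');
        if (dot < 0 || dot == name.length() - 1) {
            return null; // no extension at all, e.g. "file_list"
        }
        return FORMATS.get(name.substring(dot + 1).toLowerCase(Locale.ROOT));
    }

    /** Returns only the paths that have a recognised format. */
    public static List<String> selectFiles(List<String> paths) {
        List<String> accepted = new ArrayList<>();
        for (String path : paths) {
            String format = formatFor(path);
            if (format == null) {
                System.err.println("Skipping: " + path);
                // Without this early exit the code below would go on to use
                // the null format, which is the kind of mistake that ends in
                // a NullPointerException.
                continue;
            }
            accepted.add(path);
        }
        return accepted;
    }

    public static void main(String[] args) {
        List<String> paths = Arrays.asList(
                "collections/adhoc_colls/ap1/file_list",
                "collections/adhoc_colls/ap1/doc1.txt");
        System.out.println(selectFiles(paths));
    }
}
```

In this sketch the file_list entry is reported on standard error and left out
of the result, while files with a known extension are passed on for indexing.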

Original issue reported on code.google.com by tim.g.ar...@gmail.com on 23 Apr 2009 at 7:22

GoogleCodeExporter commented 9 years ago
Thanks for the excellent bug report.  This is now fixed and I've added a unit 
test.

Original comment by trevor.s...@gmail.com on 10 May 2009 at 9:59