OpenSextant / Xponents

Geographic Place, Date/time, and Pattern entity extraction toolkit along with text extraction from unstructured data and GIS outputters.
Apache License 2.0

Refactor resource loading, again #19

Closed: mubaldino closed this issue 7 years ago

mubaldino commented 7 years ago

Resource files must be available to SolrResourceLoader from ./lib in order to be loaded into the core.

  1. TagFilter -- loading files for GazetteerUpdateProcessorFactory is not necessary. Only basic items are needed.
  2. Instead of using items in optional JARs (e.g., the kuromoji analyzer), use locally available ./conf/lang/* files (for example, /lang/stopwords_ja.txt).
2017-01-06 23:11:08,266 ERROR [coreLoadExecutor-5-thread-1] org.opensextant.extractors.geo.GazetteerUpdateProcessorFactory: Init failure
java.io.IOException: No such stop filter file /org/apache/lucene/analysis/ja/stopwords.txt
    at org.opensextant.extractors.geo.TagFilter.loadLanguageStopwords(TagFilter.java:87)
    at org.opensextant.extractors.geo.TagFilter.<init>(TagFilter.java:72)
    at org.opensextant.extractors.geo.GazetteerUpdateProcessorFactory.init(GazetteerUpdateProcessorFactory.java:84)
    at org.apache.solr.core.SolrCore.createInitInstance(SolrCore.java:611)
    at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:2268)
    at org.apache.solr.update.processor.UpdateRequestProcessorChain.init(UpdateRequestProcessorChain.java:119)
    at org.apache.solr.core.SolrCore.createInitInstance(SolrCore.java:609)
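
A rough sketch of item 2, reading the stopword list from the core's own ./conf/lang directory rather than from a classpath entry inside an optional analyzer JAR. This is plain Java for illustration only, not the actual TagFilter code: the class name LocalStopwords, the load method, and the rule of skipping lines that start with '#' are all assumptions.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper (not part of Xponents): reads stopwords from a local conf/lang directory. */
public class LocalStopwords {

    /**
     * Loads conf/lang/stopwords_LANG.txt below the given core directory.
     * Assumed file convention: one term per line, UTF-8, blank lines and
     * lines starting with '#' are skipped.
     */
    public static Set<String> load(Path coreDir, String lang) throws IOException {
        Path file = coreDir.resolve("conf").resolve("lang").resolve("stopwords_" + lang + ".txt");
        if (!Files.isReadable(file)) {
            // Fail with a message that names the local path that was tried.
            throw new IOException("No such stop filter file " + file);
        }
        Set<String> words = new HashSet<>();
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String term = line.trim();
                if (term.isEmpty() || term.startsWith("#")) {
                    continue; // skip blank lines and comment lines
                }
                words.add(term);
            }
        }
        return words;
    }
}
```

With this approach, a call such as LocalStopwords.load(coreDir, "ja") depends only on files shipped with the core, so a missing kuromoji JAR would no longer cause an init failure.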
mubaldino commented 7 years ago

The Solr setup produces a single JAR, xponents-gazeteer-meta.jar (ant target "gaz-meta").

This is referenced by the full distribution build script, which is run with: ant -f ./script/dist.xml build