ziqizhang / jate

NEWS: JATE2.0 Beta.11 Released, see details below.
GNU Lesser General Public License v3.0

Jate stops working after couple of corpuses #35

Closed paris0120 closed 6 years ago

paris0120 commented 7 years ago

I have to manually stop the program, delete the .lock file, and restart the program.

01 Mar 2017 16:50:20 ERROR CoreContainer - Error creating core [GENIA]: Could not load conf for core GENIA: Initiating org.apache.lucene.analysis.jate.OpenNLPPOSTaggerFactory failed due to: java.lang.reflect.InvocationTargetException at sun.reflect.GeneratedConstructorAccessor138.newInstance(Unknown Source) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source) at java.lang.reflect.Constructor.newInstance(Unknown Source) at uk.ac.shef.dcs.jate.nlp.InstanceCreator.createPOSTagger(InstanceCreator.java:28) at org.apache.lucene.analysis.jate.OpenNLPPOSTaggerFactory.inform(OpenNLPPOSTaggerFactory.java:40) at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:643) at org.apache.solr.schema.IndexSchema.(IndexSchema.java:176) at org.apache.solr.schema.IndexSchemaFactory.create(IndexSchemaFactory.java:55) at org.apache.solr.schema.IndexSchemaFactory.buildIndexSchema(IndexSchemaFactory.java:69) at org.apache.solr.core.ConfigSetService.createIndexSchema(ConfigSetService.java:104) at org.apache.solr.core.ConfigSetService.getConfig(ConfigSetService.java:75) at org.apache.solr.core.CoreContainer.create(CoreContainer.java:725) at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:447) at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:438) at java.util.concurrent.FutureTask.run(Unknown Source) at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:210) at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source) Caused by: java.lang.OutOfMemoryError: Java heap space at java.io.BufferedInputStream.fill(Unknown Source) at java.io.BufferedInputStream.read1(Unknown Source) at java.io.BufferedInputStream.read(Unknown Source) at java.io.FilterInputStream.read(Unknown Source) at java.io.PushbackInputStream.read(Unknown Source) at 
java.util.zip.InflaterInputStream.fill(Unknown Source) at java.util.zip.InflaterInputStream.read(Unknown Source) at java.util.zip.ZipInputStream.read(Unknown Source) at java.io.DataInputStream.readFully(Unknown Source) at java.io.DataInputStream.readLong(Unknown Source) at java.io.DataInputStream.readDouble(Unknown Source) at opennlp.tools.ml.model.BinaryFileDataReader.readDouble(BinaryFileDataReader.java:53) at opennlp.tools.ml.model.AbstractModelReader.readDouble(AbstractModelReader.java:75) at opennlp.tools.ml.model.AbstractModelReader.getParameters(AbstractModelReader.java:146) at opennlp.tools.ml.maxent.io.GISModelReader.constructModel(GISModelReader.java:75) at opennlp.tools.ml.model.GenericModelReader.constructModel(GenericModelReader.java:59) at opennlp.tools.ml.model.AbstractModelReader.getModel(AbstractModelReader.java:87) at opennlp.tools.util.model.GenericModelSerializer.create(GenericModelSerializer.java:35) at opennlp.tools.util.model.GenericModelSerializer.create(GenericModelSerializer.java:31) at opennlp.tools.util.model.BaseModel.finishLoadingArtifacts(BaseModel.java:328) at opennlp.tools.util.model.BaseModel.loadModel(BaseModel.java:256) at opennlp.tools.util.model.BaseModel.(BaseModel.java:179) at opennlp.tools.postag.POSModel.(POSModel.java:105) at uk.ac.shef.dcs.jate.nlp.opennlp.POSTaggerOpenNLP.(POSTaggerOpenNLP.java:18) at sun.reflect.GeneratedConstructorAccessor138.newInstance(Unknown Source) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source) at java.lang.reflect.Constructor.newInstance(Unknown Source) at uk.ac.shef.dcs.jate.nlp.InstanceCreator.createPOSTagger(InstanceCreator.java:28) at org.apache.lucene.analysis.jate.OpenNLPPOSTaggerFactory.inform(OpenNLPPOSTaggerFactory.java:40) at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:643) at org.apache.solr.schema.IndexSchema.(IndexSchema.java:176) at org.apache.solr.schema.IndexSchemaFactory.create(IndexSchemaFactory.java:55)

org.apache.solr.common.SolrException: Could not load conf for core GENIA: Initiating org.apache.lucene.analysis.jate.OpenNLPPOSTaggerFactory failed due to: java.lang.reflect.InvocationTargetException at sun.reflect.GeneratedConstructorAccessor138.newInstance(Unknown Source) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source) at java.lang.reflect.Constructor.newInstance(Unknown Source) at uk.ac.shef.dcs.jate.nlp.InstanceCreator.createPOSTagger(InstanceCreator.java:28) at org.apache.lucene.analysis.jate.OpenNLPPOSTaggerFactory.inform(OpenNLPPOSTaggerFactory.java:40) at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:643) at org.apache.solr.schema.IndexSchema.(IndexSchema.java:176) at org.apache.solr.schema.IndexSchemaFactory.create(IndexSchemaFactory.java:55) at org.apache.solr.schema.IndexSchemaFactory.buildIndexSchema(IndexSchemaFactory.java:69) at org.apache.solr.core.ConfigSetService.createIndexSchema(ConfigSetService.java:104) at org.apache.solr.core.ConfigSetService.getConfig(ConfigSetService.java:75) at org.apache.solr.core.CoreContainer.create(CoreContainer.java:725) at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:447) at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:438) at java.util.concurrent.FutureTask.run(Unknown Source) at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:210) at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source) Caused by: java.lang.OutOfMemoryError: Java heap space at java.io.BufferedInputStream.fill(Unknown Source) at java.io.BufferedInputStream.read1(Unknown Source) at java.io.BufferedInputStream.read(Unknown Source) at java.io.FilterInputStream.read(Unknown Source) at java.io.PushbackInputStream.read(Unknown Source) at java.util.zip.InflaterInputStream.fill(Unknown Source) at 
java.util.zip.InflaterInputStream.read(Unknown Source) at java.util.zip.ZipInputStream.read(Unknown Source) at java.io.DataInputStream.readFully(Unknown Source) at java.io.DataInputStream.readLong(Unknown Source) at java.io.DataInputStream.readDouble(Unknown Source) at opennlp.tools.ml.model.BinaryFileDataReader.readDouble(BinaryFileDataReader.java:53) at opennlp.tools.ml.model.AbstractModelReader.readDouble(AbstractModelReader.java:75) at opennlp.tools.ml.model.AbstractModelReader.getParameters(AbstractModelReader.java:146) at opennlp.tools.ml.maxent.io.GISModelReader.constructModel(GISModelReader.java:75) at opennlp.tools.ml.model.GenericModelReader.constructModel(GenericModelReader.java:59) at opennlp.tools.ml.model.AbstractModelReader.getModel(AbstractModelReader.java:87) at opennlp.tools.util.model.GenericModelSerializer.create(GenericModelSerializer.java:35) at opennlp.tools.util.model.GenericModelSerializer.create(GenericModelSerializer.java:31) at opennlp.tools.util.model.BaseModel.finishLoadingArtifacts(BaseModel.java:328) at opennlp.tools.util.model.BaseModel.loadModel(BaseModel.java:256) at opennlp.tools.util.model.BaseModel.(BaseModel.java:179) at opennlp.tools.postag.POSModel.(POSModel.java:105) at uk.ac.shef.dcs.jate.nlp.opennlp.POSTaggerOpenNLP.(POSTaggerOpenNLP.java:18) at sun.reflect.GeneratedConstructorAccessor138.newInstance(Unknown Source) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source) at java.lang.reflect.Constructor.newInstance(Unknown Source) at uk.ac.shef.dcs.jate.nlp.InstanceCreator.createPOSTagger(InstanceCreator.java:28) at org.apache.lucene.analysis.jate.OpenNLPPOSTaggerFactory.inform(OpenNLPPOSTaggerFactory.java:40) at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:643) at org.apache.solr.schema.IndexSchema.(IndexSchema.java:176) at org.apache.solr.schema.IndexSchemaFactory.create(IndexSchemaFactory.java:55)

at org.apache.solr.core.ConfigSetService.getConfig(ConfigSetService.java:80)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:725)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:447)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:438)
at java.util.concurrent.FutureTask.run(Unknown Source)
at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:210)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)

Caused by: java.lang.IllegalArgumentException: Initiating org.apache.lucene.analysis.jate.OpenNLPPOSTaggerFactory failed due to: java.lang.reflect.InvocationTargetException at sun.reflect.GeneratedConstructorAccessor138.newInstance(Unknown Source) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source) at java.lang.reflect.Constructor.newInstance(Unknown Source) at uk.ac.shef.dcs.jate.nlp.InstanceCreator.createPOSTagger(InstanceCreator.java:28) at org.apache.lucene.analysis.jate.OpenNLPPOSTaggerFactory.inform(OpenNLPPOSTaggerFactory.java:40) at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:643) at org.apache.solr.schema.IndexSchema.(IndexSchema.java:176) at org.apache.solr.schema.IndexSchemaFactory.create(IndexSchemaFactory.java:55) at org.apache.solr.schema.IndexSchemaFactory.buildIndexSchema(IndexSchemaFactory.java:69) at org.apache.solr.core.ConfigSetService.createIndexSchema(ConfigSetService.java:104) at org.apache.solr.core.ConfigSetService.getConfig(ConfigSetService.java:75) at org.apache.solr.core.CoreContainer.create(CoreContainer.java:725) at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:447) at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:438) at java.util.concurrent.FutureTask.run(Unknown Source) at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:210) at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source) Caused by: java.lang.OutOfMemoryError: Java heap space at java.io.BufferedInputStream.fill(Unknown Source) at java.io.BufferedInputStream.read1(Unknown Source) at java.io.BufferedInputStream.read(Unknown Source) at java.io.FilterInputStream.read(Unknown Source) at java.io.PushbackInputStream.read(Unknown Source) at java.util.zip.InflaterInputStream.fill(Unknown Source) at 
java.util.zip.InflaterInputStream.read(Unknown Source) at java.util.zip.ZipInputStream.read(Unknown Source) at java.io.DataInputStream.readFully(Unknown Source) at java.io.DataInputStream.readLong(Unknown Source) at java.io.DataInputStream.readDouble(Unknown Source) at opennlp.tools.ml.model.BinaryFileDataReader.readDouble(BinaryFileDataReader.java:53) at opennlp.tools.ml.model.AbstractModelReader.readDouble(AbstractModelReader.java:75) at opennlp.tools.ml.model.AbstractModelReader.getParameters(AbstractModelReader.java:146) at opennlp.tools.ml.maxent.io.GISModelReader.constructModel(GISModelReader.java:75) at opennlp.tools.ml.model.GenericModelReader.constructModel(GenericModelReader.java:59) at opennlp.tools.ml.model.AbstractModelReader.getModel(AbstractModelReader.java:87) at opennlp.tools.util.model.GenericModelSerializer.create(GenericModelSerializer.java:35) at opennlp.tools.util.model.GenericModelSerializer.create(GenericModelSerializer.java:31) at opennlp.tools.util.model.BaseModel.finishLoadingArtifacts(BaseModel.java:328) at opennlp.tools.util.model.BaseModel.loadModel(BaseModel.java:256) at opennlp.tools.util.model.BaseModel.(BaseModel.java:179) at opennlp.tools.postag.POSModel.(POSModel.java:105) at uk.ac.shef.dcs.jate.nlp.opennlp.POSTaggerOpenNLP.(POSTaggerOpenNLP.java:18) at sun.reflect.GeneratedConstructorAccessor138.newInstance(Unknown Source) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source) at java.lang.reflect.Constructor.newInstance(Unknown Source) at uk.ac.shef.dcs.jate.nlp.InstanceCreator.createPOSTagger(InstanceCreator.java:28) at org.apache.lucene.analysis.jate.OpenNLPPOSTaggerFactory.inform(OpenNLPPOSTaggerFactory.java:40) at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:643) at org.apache.solr.schema.IndexSchema.(IndexSchema.java:176) at org.apache.solr.schema.IndexSchemaFactory.create(IndexSchemaFactory.java:55)

at org.apache.lucene.analysis.jate.OpenNLPPOSTaggerFactory.inform(OpenNLPPOSTaggerFactory.java:45)
at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:643)
at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:176)
at org.apache.solr.schema.IndexSchemaFactory.create(IndexSchemaFactory.java:55)
at org.apache.solr.schema.IndexSchemaFactory.buildIndexSchema(IndexSchemaFactory.java:69)
at org.apache.solr.core.ConfigSetService.createIndexSchema(ConfigSetService.java:104)
at org.apache.solr.core.ConfigSetService.getConfig(ConfigSetService.java:75)
... 8 more

Wed Mar 01 16:50:26 EST 2017 loading exception data for lemmatiser...
Wed Mar 01 16:50:27 EST 2017 loading exception data for lemmatiser...
Wed Mar 01 16:50:40 EST 2017 loading done
Wed Mar 01 16:50:40 EST 2017 loading done

jerrygaoLondon commented 7 years ago

It looks like JATE2 hit an out-of-memory error, which caused the Solr process to be killed unexpectedly. That is why a lock ('.lock') file is left behind. You can try increasing your JVM heap size (with the -Xms and -Xmx parameters) to handle a large corpus. -Xms sets the initial (minimum) heap size; -Xmx sets the maximum heap size. You have to specify them when you launch your application. Some ATE algorithms in JATE2 (typically Chisquare, CValue and ATTF) consume a lot of memory.
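To confirm that the heap settings actually reached the JVM, a small standalone check along these lines may help. It is only a sketch and not part of JATE: the class name `HeapCheck`, its helper methods, and the 2048 MiB threshold are all made up for illustration, and the right heap size for a given corpus has to be found by experiment.

```java
// Hypothetical helper, not part of JATE: reports the heap limits the JVM is running with.
class HeapCheck {

    /** Converts a byte count to whole mebibytes (rounding down). */
    public static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    /** Returns true when the given maximum heap is at least requiredMiB mebibytes. */
    public static boolean hasAtLeast(long maxHeapBytes, long requiredMiB) {
        return toMiB(maxHeapBytes) >= requiredMiB;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();

        // Upper bound the heap may grow to; governed by -Xmx or the JVM default.
        // The reported value can be slightly below the -Xmx figure.
        long max = rt.maxMemory();
        // Heap currently reserved by the JVM; starts near the -Xms value.
        long total = rt.totalMemory();
        // Unused part of the currently reserved heap.
        long free = rt.freeMemory();

        System.out.println("max heap (MiB):      " + toMiB(max));
        System.out.println("reserved heap (MiB): " + toMiB(total));
        System.out.println("free heap (MiB):     " + toMiB(free));

        // Assumed threshold for illustration only; tune it for your corpus.
        long requiredMiB = 2048L;
        if (!hasAtLeast(max, requiredMiB)) {
            System.out.println("Heap limit is below " + requiredMiB
                    + " MiB; consider starting the JVM with a larger -Xmx.");
        }
    }
}
```

Running it as, for example, `java -Xms1g -Xmx4g HeapCheck` should print a maximum close to 4096 MiB (the sizes here are illustrative). If the number does not change when you change -Xmx, the options are probably not being passed to the JVM that runs JATE.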

Just to note that it is fine to analyse multiple corpora at the same time, provided that they all belong to the same domain and you intend to analyse them as if they were a single corpus.

ziqizhang commented 6 years ago

I am closing this thread because it is one year old and there have been no further interactions.