Open meanna opened 4 months ago
I think you did not download the library files yet. You need to edit the pipeline with the pipeline builder and save it again. Let me know if that helps.
Could you give me an example of which components to add to the pipeline?
So far, this is what I tried.
Collection Reader:
- JCoRe File Reader
Maven artifact: de.julielab:jcore-file-reader:2.3.10
Mandatory Parameters:
InputDirectory: data/files
Analysis Engines:
- JCoRe Sentence Annotator
Maven artifact: de.julielab:jcore-jsbd-ae:2.3.10
Mandatory Parameters:
ModelFilename: de/julielab/jcore/ae/jsbd/model/jsbd-2.0.gz
- JCoRe Token Annotator
Maven artifact: de.julielab:jcore-jtbd-ae:2.3.10
Mandatory Parameters:
ModelFilename: de/julielab/jcore/ae/jtbd/model/jtbd-2.0-biomed.gz
- GazetteerAnnotator, Template Descriptor
Maven artifact: de.julielab:jcore-lingpipe-gazetteer-ae:2.3.10
Mandatory Parameters:
OutputType: <not set>
External Resources:
DictionaryChunkerProvider: <not bound>
After saving the pipeline, I also saw the following in missing_configuration.txt:
Could not reader "<AAE delegate>"due to a parsing error: Could not parse an AAE delegate specifier: An import could not be resolved. No file with name "de/julielab/jcore/types/jcore-morpho-syntax-types.xml" was found in the class path or data path. (Descriptor: file:/D:/Documents/my_doc/AIIM/repo/smithsearch_me/pipeline2/desc/JCoRe Sentence Annotator.xml)
I think you did not download the library files yet. You need to edit the pipeline with the pipeline builder and save it again. Let me know if that helps.
How should I download the library properly?
Currently, I have de.julielab:jcore-types:2.6.0 as an installed package.
The jar file contains jcore-morpho-syntax-types.xml.
You need to open the existing pipeline. You don't have to add any components. Just open the pipeline with the Pipeline Builder CLI and select the option to save the pipeline. That should do the trick.
When I load (from https://github.com/JULIELab/smithsearch/tree/main/indexing-pipeline) and save the pipeline, I don't see the lib folder being created, and the local path in aeDescriptions.json does not change.
For example, I still have
` "file": "D:\\Users\\faessler\\.m2\\repository\\de\\julielab\\jcore-jsbd-ae-medical-german\\2.6.0\\jcore-jsbd-ae-medical-german-2.6.0.jar",`
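To see at a glance which entries still point to another machine, something like the following could help. It is only a sketch: it assumes nothing about the layout of aeDescriptions.json beyond the `"file"` key shown above, so it simply walks the whole JSON document. The function names are made up for this example.

```python
import json
from pathlib import Path


def find_file_entries(node):
    """Recursively yield every string stored under a "file" key."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "file" and isinstance(value, str):
                yield value
            else:
                yield from find_file_entries(value)
    elif isinstance(node, list):
        for item in node:
            yield from find_file_entries(item)


def missing_artifact_files(descriptions_path):
    """Return the "file" entries of a JSON file that do not exist on this machine."""
    with open(descriptions_path, encoding="utf-8") as handle:
        data = json.load(handle)
    return [path for path in find_file_entries(data) if not Path(path).exists()]
```

Calling `missing_artifact_files("aeDescriptions.json")` from the pipeline directory returns the paths that cannot be found locally; an empty list would suggest that the paths were rewritten on save.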
Any error messages in the console when you scroll up after saving? Which JCoRe repositories did you select?
There was no error message after saving.
You are here: Index/Save Pipeline
Enter the directory to save the pipeline to. [indexing-pipeline]: temp
Storing pipeline. It may take a while to gather all transitive dependencies, please wait...
Saved pipeline to D:\Documents\my_doc\AIIM\repo\smithsearch_me\temp
You are here: Index
Collection Reader:
- JCoRe File Reader
Maven artifact: de.julielab:jcore-file-reader:2.6.1
Mandatory Parameters:
InputDirectory: data/files/jsyncc
Analysis Engines:
- JCoRe Sentence Annotator
Maven artifact: de.julielab:jcore-jsbd-ae-medical-german:2.6.0
Mandatory Parameters:
ModelFilename: de/julielab/jcore/ae/jsbd/model/jsbd-framed.gz
- JCoRe Token Annotator
Maven artifact: de.julielab:jcore-jtbd-ae-medical-german:2.6.0
Mandatory Parameters:
ModelFilename: de/julielab/jcore/ae/jtbd/model/jtbd-framed.gz
- ICD10 Gazetteer
Maven artifact: de.julielab:jcore-lingpipe-gazetteer-ae:2.6.1
Mandatory Parameters:
OutputType: de.julielab.jcore.types.EntityMention
External Resources:
DictionaryChunkerProvider: ConfigurableDictionaryChunkerProvider
Name: ConfigurableDictionaryChunkerProvider
Description: Employs the configurable alternative chunker implementation. This alternative
implementation is primarily meant for approximate matching and should be used in this intent.
It offers text normalization for better dictionary matching and the removal of accents (transliteration).
It is configured in the component descriptor and directly references the dictionary URL.
Implementation: de.julielab.jcore.ae.lingpipegazetteer.chunking.ConfigurableChunkerProviderImplAlt
Resource URL: file:resources/ICD10-2022-reversed.dict
UseApproximateMatching: true
CaseSensitive: false
MakeVariants: false
NormalizeText: true
NormalizePlural: false
TransliterateText: false
StopWordFile: /de/julielab/jcore/ae/lingpipegazetteer/stopwords/DE_from_unine.ch
CAS Consumers:
- JCore ElasticSearch Consumer
Maven artifact: de.julielab:jcore-elasticsearch-consumer:2.6.1
Mandatory Parameters:
urls: [http://localhost:9200]
indexName: smith
Which JCoRe repositories did you select?
Not sure if I understand this question correctly.
I opened the smithsearch repo in IntelliJ, where there were options to install packages, and I clicked install. I have the following jcore packages installed in my environment:
de.julielab:jcore-elasticsearch-consumer:2.6.2
de.julielab:jcore-types:2.6.0
de.julielab:jcore-utilities:2.6.0
de.julielab:julielab-java-utilities:1.5.0
There was no error message after saving.
This doesn't look so bad. I notice, however, that you saved the pipeline into another directory. Perhaps this is why your original files are unchanged.
Could you post the contents of the directory D:\Documents\my_doc\AIIM\repo\smithsearch_me\temp?
Which JCoRe repositories did you select?
Not sure if I understand this question correctly.
Start the Pipeline Builder CLI, use menu option 10. What's the output there?
I opened the smithsearch repo in IntelliJ, where there were options to install packages, and I clicked install. I have the following jcore packages installed in my environment:
de.julielab:jcore-elasticsearch-consumer:2.6.2
de.julielab:jcore-types:2.6.0
de.julielab:jcore-utilities:2.6.0
de.julielab:julielab-java-utilities:1.5.0
These libraries are required to build the custom indexing code in the src/ folder.
In theory, it should work like this:
- clone the project from GitHub
- enter the indexing-pipeline directory
- open the pipeline using the pipeline builder CLI
- save the pipeline into the original directory to create the lib/ directory
- do mvn clean package in the pipeline directory to also compile the custom code and place the resulting JAR into the `lib/` directory
- run the pipeline
Could you post the contents of the directory D:\Documents\my_doc\AIIM\repo\smithsearch_me\temp?
├── aeDescriptions.json
├── aeFlowControllerDescriptions.json
├── ccDescriptions.json
├── ccFlowControllerDescriptions.json
├── cmDescriptions.json
├── crDescriptions.json
├── desc
│ ├── AggregateAnalysisEngine.xml
│ ├── CPE.xml
│ ├── ICD10 Gazetteer.xml
│ ├── JCoRe File Reader.xml
│ ├── JCoRe Sentence Annotator.xml
│ ├── JCoRe Token Annotator.xml
│ ├── JCore ElasticSearch Consumer.xml
│ └── cpeAAE.xml
├── descAll
│ ├── AggregateAnalysisEngine.xml
│ ├── AggregateAnalysisEngineWithIntegratedDelegateDescriptors.xml
│ ├── ICD10 Gazetteer.xml
│ ├── JCoRe File Reader.xml
│ ├── JCoRe Sentence Annotator.xml
│ ├── JCoRe Token Annotator.xml
│ └── JCore ElasticSearch Consumer.xml
└── version-pipelinebuilder.txt
When I save the pipeline in the same directory, the lib folder is created and the paths are changed to local ones. I think this problem is solved now.
These libraries are required to build the custom indexing code in the src/ folder.
I don't quite get it. Which src? Do I have to clone https://github.com/JULIELab/jcore-base or other repositories besides smithsearch? Could you please give me detailed instructions on how to do this step?
I followed the steps you suggested:
- clone the project from GitHub
- enter the indexing-pipeline directory
- open the pipeline using the pipeline builder CLI
- save the pipeline into the original directory to create the lib/ directory
- do mvn clean package in the pipeline directory to also compile the custom code and place the resulting JAR into the `lib/` directory
- run the pipeline
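Before the final run step, the result of the save and build steps can be checked with a few lines of Python. This is only a sketch under assumptions taken from this thread: that a saved pipeline has a desc/CPE.xml descriptor and that the JARs end up directly in lib/. The function name `preflight` is made up for this example.

```python
from pathlib import Path


def preflight(pipeline_dir):
    """Return a list of problems found in a saved pipeline directory."""
    root = Path(pipeline_dir)
    problems = []
    # The runner log in this thread reads the CPE descriptor from desc/CPE.xml.
    if not (root / "desc" / "CPE.xml").is_file():
        problems.append("desc/CPE.xml is missing")
    # lib/ is expected to be created when the pipeline is saved in place.
    lib = root / "lib"
    if not lib.is_dir():
        problems.append("lib/ is missing (was the pipeline saved in place?)")
    elif not any(lib.glob("*.jar")):
        problems.append("lib/ contains no JAR files")
    return problems
```

An empty list from `preflight(".")` in the pipeline directory means these basic checks passed; it says nothing about whether the individual components are configured correctly.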
In the indexing pipeline directory, I ran
java -jar ../jcore-pipeline-modules/bin/jcore-pipeline-runner-base-0.5.3-cli-assembly.jar run.xml
and got the following error:
16:45:08.276 [main] INFO o.a.c.b.FluentPropertyBeanIntrospector - Error when creating PropertyDescriptor for public final void org.apache.commons.configuration2.AbstractConfiguration.setProperty(java.lang.String,java.lang.Object)! Ignoring this property.
16:45:08.661 [main] INFO d.j.j.p.runner.CPEBootstrapRunner - Found JCoRe CPE runner at ../jcore-pipeline-modules/bin/jcore-pipeline-runner-cpe-0.5.3-jar-with-dependencies.jar
16:45:11.068 [main] INFO de.julielab.jcore.pipeline.runner.cpe.CPERunner - Creating CPE description from /mnt/d/Documents/my_doc/AIIM/repo/smithsearch_me/indexing-pipeline2/./desc/CPE.xml
16:45:11.282 [main] INFO de.julielab.jcore.pipeline.runner.cpe.CPERunner - Setting processing unit thread count to 1
16:45:11.282 [main] INFO de.julielab.jcore.pipeline.runner.cpe.CPERunner - Setting cas pool size to 1
16:45:11.282 [main] INFO de.julielab.jcore.pipeline.runner.cpe.CPERunner - CPE Checkpoint batch size not set in CPE descriptor. Setting batch size to 500
16:45:11.282 [main] INFO de.julielab.jcore.pipeline.runner.cpe.CPERunner - Creating CPE...
16:45:11.594 [main] INFO de.julielab.jcore.ae.jsbd.main.SentenceAnnotator - initializing JSBD Annotator ...
Couldn't open cc.mallet.util.MalletLogger resources/logging.properties file.
Perhaps the 'resources' directories weren't copied into the 'class' directory.
Continuing.
16:45:11.686 [main] INFO de.julielab.jcore.ae.jsbd.Abstract2UnitPipe - This sentence splitter model allows sentence splits after all punctuation: false
16:45:11.722 [main] INFO de.julielab.jcore.ae.jtbd.main.TokenAnnotator - [JTBD] initializing JTBD Annotator ...
16:45:11.724 [main] INFO de.julielab.jcore.ae.jtbd.main.TokenAnnotator - Loading model as classpathresource
16:45:11.840 [main] INFO de.julielab.jcore.ae.jtbd.main.TokenAnnotator - initialize() - will tokenize only text covered by sentence annotations
16:45:11.848 [main] INFO d.j.jcore.ae.lingpipegazetteer.chunking.ConfigurableChunkerProviderImplAlt - Creating dictionary chunker with dictionary loaded from file:resources/ICD10-2022-reversed.dict
16:45:11.848 [main] INFO d.j.jcore.ae.lingpipegazetteer.chunking.ConfigurableChunkerProviderImplAlt - Generate variants: false
16:45:11.849 [main] INFO d.j.jcore.ae.lingpipegazetteer.chunking.ConfigurableChunkerProviderImplAlt - Normalize dictionary entries (i.e. completely strip dashes, parenthesis etc): true
16:45:11.849 [main] INFO d.j.jcore.ae.lingpipegazetteer.chunking.ConfigurableChunkerProviderImplAlt - Also normalize plural forms to singular: false
16:45:11.849 [main] INFO d.j.jcore.ae.lingpipegazetteer.chunking.ConfigurableChunkerProviderImplAlt - Transliterate dictionary entries (i.e. transform accented characters to their base forms): false
16:45:11.849 [main] INFO d.j.jcore.ae.lingpipegazetteer.chunking.ConfigurableChunkerProviderImplAlt - Case sensitive: false
16:45:11.849 [main] INFO d.j.jcore.ae.lingpipegazetteer.chunking.ConfigurableChunkerProviderImplAlt - Use approximate matching: true
16:45:11.852 [main] INFO d.j.jcore.ae.lingpipegazetteer.chunking.ConfigurableChunkerProviderImplAlt - readDictionary() - adding entries from /de/julielab/jcore/ae/lingpipegazetteer/stopwords/DE_from_unine.ch to dictionary...
16:45:11.854 [main] INFO d.j.jcore.ae.lingpipegazetteer.chunking.ConfigurableChunkerProviderImplAlt - readDictionary() - adding entries from file:resources/ICD10-2022-reversed.dict to dictionary...
16:45:12.012 [main] ERROR d.j.jcore.ae.lingpipegazetteer.chunking.ConfigurableChunkerProviderImplAlt - readDictionary() - wrong format of line: medicine E1
16:45:12.034 [main] ERROR d.j.jcore.ae.lingpipegazetteer.chunking.ConfigurableChunkerProviderImplAlt - Exception while creating chunker instance from dictionary file file:resources/ICD10-2022-reversed.dict with stopwords from /de/julielab/jcore/ae/lingpipegazetteer/stopwords/DE_from_unine.ch
org.apache.uima.analysis_engine.AnalysisEngineProcessException: Annotator processing failed.
at de.julielab.jcore.ae.lingpipegazetteer.chunking.ConfigurableChunkerProviderImplAlt.readDictionary(ConfigurableChunkerProviderImplAlt.java:247)
at de.julielab.jcore.ae.lingpipegazetteer.chunking.ConfigurableChunkerProviderImplAlt.load(ConfigurableChunkerProviderImplAlt.java:127)
at org.apache.uima.resource.impl.ResourceManager_impl.registerResource(ResourceManager_impl.java:753)
at org.apache.uima.resource.impl.ResourceManager_impl.initializeExternalResources(ResourceManager_impl.java:597)
at org.apache.uima.resource.Resource_ImplBase.initialize(Resource_ImplBase.java:210)
at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.initialize(AnalysisEngineImplBase.java:157)
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initialize(PrimitiveAnalysisEngine_impl.java:136)
at org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:279)
at org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:407)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl.setup(ASB_impl.java:256)
at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initASB(AggregateAnalysisEngine_impl.java:435)
at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initializeAggregateAnalysisEngine(AggregateAnalysisEngine_impl.java:379)
at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initialize(AggregateAnalysisEngine_impl.java:192)
at org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:279)
at org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:407)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl.setup(ASB_impl.java:256)
at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initASB(AggregateAnalysisEngine_impl.java:435)
at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initializeAggregateAnalysisEngine(AggregateAnalysisEngine_impl.java:379)
at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initialize(AggregateAnalysisEngine_impl.java:192)
at org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:279)
at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:331)
at org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:448)
at org.apache.uima.collection.impl.cpm.container.CPEFactory.produceIntegratedCasProcessor(CPEFactory.java:1071)
at org.apache.uima.collection.impl.cpm.container.CPEFactory.getCasProcessors(CPEFactory.java:550)
at org.apache.uima.collection.impl.cpm.BaseCPMImpl.init(BaseCPMImpl.java:255)
at org.apache.uima.collection.impl.cpm.BaseCPMImpl.<init>(BaseCPMImpl.java:128)
at org.apache.uima.collection.impl.CollectionProcessingEngine_impl.initialize(CollectionProcessingEngine_impl.java:73)
at org.apache.uima.impl.UIMAFramework_impl._produceCollectionProcessingEngine(UIMAFramework_impl.java:438)
at org.apache.uima.UIMAFramework.produceCollectionProcessingEngine(UIMAFramework.java:918)
at de.julielab.jcore.pipeline.runner.cpe.CPERunner.createCPE(CPERunner.java:171)
at de.julielab.jcore.pipeline.runner.cpe.CPERunner.runCPE(CPERunner.java:221)
at de.julielab.jcore.pipeline.runner.cpe.CPERunner.process(CPERunner.java:205)
at de.julielab.jcore.pipeline.runner.cpe.CPERunner.main(CPERunner.java:66)
16:45:12.048 [main] INFO de.julielab.jcore.ae.lingpipegazetteer.uima.GazetteerAnnotator - calls to initialize: 0
16:45:12.048 [main] INFO de.julielab.jcore.ae.lingpipegazetteer.uima.GazetteerAnnotator - initialize() - initializing GazetteerAnnotator...
16:45:12.048 [main] INFO de.julielab.jcore.ae.lingpipegazetteer.uima.GazetteerAnnotator - Check for acronyms (found dictionary entries that are abbreviations are only accepted if their long form is an abbreviation of the same type, too): true
16:45:12.048 [main] INFO de.julielab.jcore.ae.lingpipegazetteer.uima.GazetteerAnnotator - Normalize CAS document text (i.e. do stemming and remove possessive 's): true
16:45:12.049 [main] INFO de.julielab.jcore.ae.lingpipegazetteer.uima.GazetteerAnnotator - Transliterate CAS document text (i.e. transform accented characters to their base forms): false
16:45:12.059 [main] INFO de.julielab.jcore.consumer.es.AbstractCasToJsonConsumer - FilterBoards: null
16:45:12.059 [main] INFO de.julielab.jcore.consumer.es.AbstractCasToJsonConsumer - FieldGenerators: [de.julielab.smithsearch.index.SmithSearchFieldGenerator]
16:45:12.059 [main] INFO de.julielab.jcore.consumer.es.AbstractCasToJsonConsumer - DocumentGenerators: []
16:45:12.059 [main] INFO de.julielab.jcore.consumer.es.AbstractCasToJsonConsumer - IdField: null
16:45:12.059 [main] INFO de.julielab.jcore.consumer.es.AbstractCasToJsonConsumer - IdPrefix: null
16:45:12.287 [main] INFO de.julielab.jcore.consumer.es.ElasticSearchConsumer - urls: [http://localhost:9200]
16:45:12.287 [main] INFO de.julielab.jcore.consumer.es.ElasticSearchConsumer - indexName: smith
16:45:12.287 [main] INFO de.julielab.jcore.consumer.es.ElasticSearchConsumer - type: null
16:45:12.287 [main] INFO de.julielab.jcore.consumer.es.ElasticSearchConsumer - deleteDocumentsBeforeIndexing: false
16:45:12.287 [main] INFO de.julielab.jcore.consumer.es.ElasticSearchConsumer - documentIdField: null
16:45:12.289 [main] INFO de.julielab.jcore.pipeline.runner.cpe.CPERunner - Start processing ..
16:45:12.400 [CPMEngine Thread] INFO de.julielab.jcore.pipeline.runner.cpe.StatusCallbackListener - CPE Initialization complete
Jul 17, 2024 4:45:12 PM org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl callAnalysisComponentProcess(445)
SEVERE: Exception occurred
org.apache.uima.analysis_engine.AnalysisEngineProcessException: Annotator processing failed.
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:427)
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.innerCall(PrimitiveAnalysisEngine_impl.java:329)
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:321)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:570)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:412)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:344)
at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:271)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:570)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:412)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:344)
at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:271)
at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:269)
at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.processNext(ProcessingUnit.java:903)
at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.run(ProcessingUnit.java:576)
Caused by: java.lang.IllegalStateException: The actual gazetteer object is null. Check previous log messages pointing to the error (most probably the dictionary file could not be found).
at de.julielab.jcore.ae.lingpipegazetteer.uima.GazetteerAnnotator.process(GazetteerAnnotator.java:309)
at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48)
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:411)
... 13 more
Jul 17, 2024 4:45:12 PM org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl processAndOutputNewCASes(279)
SEVERE: Exception occurred
org.apache.uima.analysis_engine.AnalysisEngineProcessException: Annotator processing failed.
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:427)
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.innerCall(PrimitiveAnalysisEngine_impl.java:329)
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:321)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:570)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:412)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:344)
at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:271)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:570)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:412)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:344)
at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:271)
at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:269)
at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.processNext(ProcessingUnit.java:903)
at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.run(ProcessingUnit.java:576)
Caused by: java.lang.IllegalStateException: The actual gazetteer object is null. Check previous log messages pointing to the error (most probably the dictionary file could not be found).
at de.julielab.jcore.ae.lingpipegazetteer.uima.GazetteerAnnotator.process(GazetteerAnnotator.java:309)
at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48)
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:411)
... 13 more
Jul 17, 2024 4:45:12 PM org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl processAndOutputNewCASes(279)
SEVERE: Exception occurred
org.apache.uima.analysis_engine.AnalysisEngineProcessException: Annotator processing failed.
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:427)
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.innerCall(PrimitiveAnalysisEngine_impl.java:329)
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:321)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:570)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:412)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:344)
at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:271)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:570)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:412)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:344)
at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:271)
at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:269)
at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.processNext(ProcessingUnit.java:903)
at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.run(ProcessingUnit.java:576)
Caused by: java.lang.IllegalStateException: The actual gazetteer object is null. Check previous log messages pointing to the error (most probably the dictionary file could not be found).
at de.julielab.jcore.ae.lingpipegazetteer.uima.GazetteerAnnotator.process(GazetteerAnnotator.java:309)
at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48)
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:411)
... 13 more
org.apache.uima.analysis_engine.AnalysisEngineProcessException: Annotator processing failed.
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:427)
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.innerCall(PrimitiveAnalysisEngine_impl.java:329)
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:321)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:570)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:412)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:344)
at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:271)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:570)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:412)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:344)
at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:271)
at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:269)
at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.processNext(ProcessingUnit.java:903)
at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.run(ProcessingUnit.java:576)
Caused by: java.lang.IllegalStateException: The actual gazetteer object is null. Check previous log messages pointing to the error (most probably the dictionary file could not be found).
at de.julielab.jcore.ae.lingpipegazetteer.uima.GazetteerAnnotator.process(GazetteerAnnotator.java:309)
at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48)
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:411)
... 13 more
Jul 17, 2024 4:45:12 PM org.apache.uima.collection.impl.cpm.engine.ProcessingUnit process
SEVERE: The container CPE AAE returned the following error message: Annotator processing failed. (Thread Name: [Procesing Pipeline#1 Thread]::)
Jul 17, 2024 4:45:12 PM org.apache.uima.collection.impl.cpm.engine.ProcessingUnit maybeLogSevereException(2509)
SEVERE: Thread: [Procesing Pipeline#1 Thread]::, message: Annotator processing failed.
org.apache.uima.analysis_engine.AnalysisEngineProcessException: Annotator processing failed.
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:427)
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.innerCall(PrimitiveAnalysisEngine_impl.java:329)
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:321)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:570)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:412)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:344)
at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:271)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:570)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:412)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:344)
at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:271)
at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:269)
at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.processNext(ProcessingUnit.java:903)
at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.run(ProcessingUnit.java:576)
Caused by: java.lang.IllegalStateException: The actual gazetteer object is null. Check previous log messages pointing to the error (most probably the dictionary file could not be found).
at de.julielab.jcore.ae.lingpipegazetteer.uima.GazetteerAnnotator.process(GazetteerAnnotator.java:309)
at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48)
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:411)
... 13 more
16:45:12.604 [[Procesing Pipeline#1 Thread]::] ERROR de.julielab.jcore.pipeline.runner.cpe.StatusCallbackListener - Exception occurred while processing
document with ID jsyncc: [org.apache.uima.analysis_engine.AnalysisEngineProcessException: Annotator processing failed. ] Writing error message to pipeline-error-jsyncc-Wed Jul 17 16:45:12 CEST 2024.err
16:45:12.605 [[Procesing Pipeline#1 Thread]::] ERROR de.julielab.jcore.pipeline.runner.cpe.StatusCallbackListener - Components failed: [Process]
16:45:12.605 [[Procesing Pipeline#1 Thread]::] ERROR de.julielab.jcore.pipeline.runner.cpe.StatusCallbackListener - Error message: failed
16:45:12.605 [[Procesing Pipeline#1 Thread]::] ERROR de.julielab.jcore.pipeline.runner.cpe.StatusCallbackListener - Process trace:
16:45:12.605 [[Procesing Pipeline#1 Thread]::] ERROR de.julielab.jcore.pipeline.runner.cpe.StatusCallbackListener - Last exception:
org.apache.uima.analysis_engine.AnalysisEngineProcessException: Annotator processing failed.
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:427)
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.innerCall(PrimitiveAnalysisEngine_impl.java:329)
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:321)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:570)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:412)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:344)
at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:271)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:570)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:412)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:344)
at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:271)
at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:269)
at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.processNext(ProcessingUnit.java:903)
at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.run(ProcessingUnit.java:576)
Caused by: java.lang.IllegalStateException: The actual gazetteer object is null. Check previous log messages pointing to the error (most probably the dictionary file could not be found).
at de.julielab.jcore.ae.lingpipegazetteer.uima.GazetteerAnnotator.process(GazetteerAnnotator.java:309)
at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48)
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:411)
... 13 common frames omitted
Jul 17, 2024 4:45:12 PM org.apache.uima.collection.impl.cpm.container.ProcessingContainer_Impl process
SEVERE: The CPM stopped because the configured error threshold 0 was exceeded. (Thread Name: [Procesing Pipeline#1 Thread]::) Component Name: CPE AAE
Jul 17, 2024 4:45:12 PM org.apache.uima.collection.impl.cpm.engine.ProcessingUnit process
SEVERE: The CPM is terminating. The current component is CPE AAE. (Thread Name: [Procesing Pipeline#1 Thread]::)
Jul 17, 2024 4:45:12 PM org.apache.uima.collection.impl.cpm.engine.ProcessingUnit process
WARNING: The CPM cannot be stopped by force. The current container is CPE AAE. (Thread Name: [Procesing Pipeline#1 Thread]::) Reason: The CAS processor CPE AAE is configured to stop the CPM when excessive errors are encountered. (Thread Name: [Procesing Pipeline#1 Thread]::)
Jul 17, 2024 4:45:12 PM org.apache.uima.collection.impl.cpm.engine.ProcessingUnit maybeLogSevereException(2509)
SEVERE: Thread: [Procesing Pipeline#1 Thread]::, message:
org.apache.uima.collection.base_cpm.AbortCPMException:
at org.apache.uima.collection.impl.cpm.container.ProcessingContainer_Impl.incrementCasProcessorErrors(ProcessingContainer_Impl.java:822)
at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.processNext(ProcessingUnit.java:1047)
at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.run(ProcessingUnit.java:576)
Jul 17, 2024 4:45:12 PM org.apache.uima.collection.impl.cpm.engine.CPMEngine process
INFO: The collection reader thread state is: 1004 (Thread Name: [Procesing Pipeline#1 Thread]::)
Jul 17, 2024 4:45:12 PM org.apache.uima.collection.impl.cpm.engine.CPMEngine process
INFO: The CPM processing unit is 0 and processing state 2003. (Thread Name: [Procesing Pipeline#1 Thread]::)
Jul 17, 2024 4:45:12 PM org.apache.uima.collection.impl.cpm.engine.CPMEngine process
INFO: The application stopped the CPM. (Thread Name: [Procesing Pipeline#1 Thread]::)
Jul 17, 2024 4:45:12 PM org.apache.uima.collection.impl.cpm.engine.CPMEngine process
INFO: The CPM engine is stopping. An end-of-file token is added to the worker queue. (Thread Name: [Procesing Pipeline#1 Thread]::) Forced stop: true
16:45:12.618 [CPMEngine Thread] INFO de.julielab.jcore.consumer.es.ElasticSearchConsumer - Collection complete.
16:45:12.619 [BaseCPMImpl-Thread] INFO de.julielab.jcore.pipeline.runner.cpe.StatusCallbackListener - The CPE has been aborted by the framework. The JVM is forcibly quit to avoid the application getting stuck on some threads that could not be stopped.
16:45:13.619 [main] INFO d.j.j.p.runner.CPEBootstrapRunner - Pipeline run completed.
Exception in thread "main" java.lang.RuntimeException: Pipeline runner process exited with status 1
at de.julielab.jcore.pipeline.runner.CPEBootstrapRunner.runPipeline(CPEBootstrapRunner.java:92)
at de.julielab.jcore.pipeline.runner.services.PipelineRunnerService.runPipeline(PipelineRunnerService.java:44)
at de.julielab.jcore.pipeline.runner.services.PipelineRunnerService.runPipeline(PipelineRunnerService.java:57)
at de.julielab.jcore.pipeline.runner.application.PipelineRunnerCLI.run(PipelineRunnerCLI.java:45)
at de.julielab.jcore.pipeline.runner.application.PipelineRunnerCLI.main(PipelineRunnerCLI.java:34)
Hi,
I'm trying to run the indexing pipeline, following your instructions.
In the indexing pipeline directory (containing desc, etc.), I ran
java -jar ../jcore-pipeline-modules/bin/jcore-pipeline-runner-base-0.5.3-cli-assembly.jar run.xml
and got the following error:
No file with name "de/julielab/jcore/types/jcore-morpho-syntax-types.xml" was found in the class path or data path.
JCoRe is installed as a Maven package, and it contains
types/jcore-morpho-syntax-types.xml
This is the full log.
Could you please help me?
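
For anyone who wants to double-check the same thing, here is a minimal sketch of how to test whether the type system descriptor is visible to the JVM. It is not part of JCoRe or the pipeline runner: the class name DescriptorCheck and its two helper methods are made up for illustration, and the jar paths passed on the command line are whatever jars you want to inspect. It uses only the standard java.util.jar API.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.jar.JarFile;

// Hypothetical helper, not part of JCoRe: checks where a descriptor file can be found.
class DescriptorCheck {

    // Returns true if the given jar has an entry with exactly this name.
    static boolean jarContainsEntry(Path jar, String entryName) throws IOException {
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            return jarFile.getEntry(entryName) != null;
        }
    }

    // Returns true if the resource can be resolved by this class's class loader,
    // i.e. it is on the class path of the JVM running this check.
    static boolean onClasspath(String resourceName) {
        return DescriptorCheck.class.getClassLoader().getResource(resourceName) != null;
    }

    public static void main(String[] args) throws IOException {
        // The name from the error message above.
        String descriptor = "de/julielab/jcore/types/jcore-morpho-syntax-types.xml";
        System.out.println("On class path: " + onClasspath(descriptor));
        // Each argument is a path to a jar to inspect (chosen by the caller).
        for (String arg : args) {
            System.out.println(arg + " contains descriptor: "
                    + jarContainsEntry(Paths.get(arg), descriptor));
        }
    }
}
```

The check compares the full entry name, so a jar that stores the file under a different directory would be reported as not containing it. A jar can also contain the file while the runner's class path does not include that jar, which would fit the situation where the pipeline's library files have not been downloaded yet.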