Closed azazali30 closed 1 year ago
Are you actually using result specifications in your setup?
No, we are not using it. I am not sure how it is being used internally.
Are you declaring any output capabilities in your XML descriptors or using uimaFIT annotations?
Are you calling the `process` method or a similar method of a UIMA component from multiple concurrent threads?
Below is the gist of the main code, extracted from my code base:

    org.apache.uima.util.JCasPool jCasPool = new JCasPool(poolSize, aae);
    List<String> textsArray = ...;
    for (String text : textsArray) {
        // check out the JCas before the try block so it is in scope in finally
        JCas jCas = jCasPool.getJCas();
        try {
            // update jCas with text
            engine.process(jCas);
        } finally {
            jCasPool.releaseJCas(jCas);
        }
    }
Ok, but is this code called from multiple threads? Note that UIMA components are not expected to be thread-safe. When UIMA parallelizes, it creates multiple instances of a component - one for each of the parallel threads. A component may declare that it is not parallelizable (e.g. writers or components with static fields), then UIMA would not parallelize the component at all and only use a single single-threaded instance of this component.
Are you trying to share a component across multiple concurrent threads?
We are caching the engine created with `AnalysisEngine engine = org.apache.uima.fit.factory.AnalysisEngineFactory.createEngine(aaeDesc);`, so every thread will be using the same instance of AnalysisEngine. Is that fine?
If every thread is using the same instance of the analysis engine, then you are sharing that instance across threads. This is not supported. Every thread must have its own instance.
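The "one instance per thread" rule can be sketched with the JDK alone. This is a minimal illustration, not UIMA code: `Engine` is a hypothetical stand-in for a non-thread-safe component such as an AnalysisEngine, and `processAll` is an invented helper. In a real setup, the assumption is that you would cache the engine description (`aaeDesc`) and have each thread create its own engine from it, rather than caching the engine itself.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PerThreadEngines {

    /** Hypothetical stand-in for a non-thread-safe component (not a UIMA class). */
    static final class Engine {
        private final Thread owner = Thread.currentThread();
        private int processed; // unsynchronized state, safe only because one thread owns it

        String process(String text) {
            // Fail loudly if the instance is ever shared across threads.
            if (Thread.currentThread() != owner) {
                throw new IllegalStateException("engine used from a foreign thread");
            }
            processed++;
            return text.toUpperCase();
        }
    }

    /** Processes all texts on a fixed number of worker threads; returns how many succeeded. */
    static int processAll(List<String> texts, int threads) {
        // Each worker thread lazily creates its own Engine on first use.
        ThreadLocal<Engine> engines = ThreadLocal.withInitial(Engine::new);
        AtomicInteger done = new AtomicInteger();
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        try {
            for (String text : texts) {
                executor.submit(() -> {
                    engines.get().process(text);
                    done.incrementAndGet();
                });
            }
        } finally {
            executor.shutdown();
        }
        try {
            if (!executor.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timed out waiting for workers");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return done.get();
    }

    public static void main(String[] args) {
        // Four documents, two worker threads, at most two Engine instances.
        System.out.println(processAll(List.of("a", "b", "c", "d"), 2)); // prints 4
    }
}
```

The `ThreadLocal` ties each instance to the thread that created it, so no locking is needed around `process`. A pool of engine instances checked out per task would be an alternative if threads are short-lived.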
@reckart I wonder why the JCasPool documentation says we can use this pool when multiple CASes need to be processed simultaneously. Also, JCasPool has a constructor that accepts an AnalysisEngine as a parameter, which suggests it will create these JCas instances using the same AE. Can you help me understand why this does not contradict your statement? Thanks.
The creation of a new CAS can be an expensive process. Thus, instead of creating a new CAS object for every document, it can be sensible to maintain a pool of CAS objects which are reused while processing a batch of documents.
The CAS pool needs to know information like the type system, index definitions, etc. which can be obtained from an analysis engine - it does not need the engine itself. The constructor that takes an engine is a convenience constructor. The relevant one is `org.apache.uima.util.JCasPool.JCasPool(int, ProcessingResourceMetaData)`, which only considers the configuration, not the actual engine.
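To make the distinction concrete, here is a small JDK-only sketch of the pooling idea described above. It is not the JCasPool implementation: `Config` and `Doc` are hypothetical stand-ins for the processing metadata and a CAS, and resetting on release is a choice made for this sketch. The point it illustrates is that the pool is built from configuration alone and holds no reference to any engine.

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class ReusablePool {

    /** Hypothetical stand-in for configuration such as type system and index definitions. */
    static final class Config {
        final String typeSystemName;
        Config(String typeSystemName) { this.typeSystemName = typeSystemName; }
    }

    /** Hypothetical stand-in for a CAS: costly to create, cheap to reset. */
    static final class Doc {
        final Config config;
        final StringBuilder text = new StringBuilder();
        Doc(Config config) { this.config = config; }
        void reset() { text.setLength(0); }
    }

    private final BlockingQueue<Doc> free;
    private int created;

    /** Creates all pooled objects up front from the configuration only. */
    ReusablePool(int size, Config config) {
        free = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            free.add(new Doc(config));
            created++;
        }
    }

    /** Blocks until a pooled object is available. */
    Doc acquire() {
        try {
            return free.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** Clears the object and returns it to the pool for reuse. */
    void release(Doc doc) {
        doc.reset();
        free.add(doc);
    }

    int created() { return created; }

    int available() { return free.size(); }

    public static void main(String[] args) {
        ReusablePool pool = new ReusablePool(2, new Config("demo"));
        for (String text : List.of("a", "b", "c", "d", "e")) {
            Doc doc = pool.acquire();
            try {
                doc.text.append(text);
            } finally {
                pool.release(doc);
            }
        }
        // Five documents were handled with only two pooled objects.
        System.out.println(pool.created()); // prints 2
    }
}
```

The pool itself can be shared across threads because the queue handles the synchronization. Each checked-out object is still used by one thread at a time, and each thread would pass it to its own engine instance.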
Describe the bug: We are running analysis using a JCasPool that can have up to 60 JCas objects available at a time. After upgrading to UIMA 3.4.1, we started seeing this NullPointerException in ResultSpecification_impl.intersect. FYI: