Aggregate with CAS Multiplier & Merger returns wrong result CAS when run in SimplePipeline

GoogleCodeExporter commented 9 years ago

An aggregate with a CAS multiplier, some processing steps, and a CAS merger 
does not return the correct merged CAS when run in SimplePipeline.runPipeline.
(Maybe such a pipeline is not a SimplePipeline anymore :)

First, for the aggregate, outputsNewCASes is false even if it is true for some 
of its primitives.

Second, it seems that processAndOutputNewCASes instead of process needs to be 
called in such cases.
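
To make the second point concrete, here is a minimal sketch in plain Java. It does not use UIMA at all: MultiplierSketch and its methods are made-up stand-ins, and the assumption (as far as I understand the framework) is that a plain process() call lets the engine produce its new CASes but then discards them, while processAndOutputNewCASes hands them back to the caller.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Stand-in types for illustration only; none of this is UIMA or uimaFIT API.
final class MultiplierSketch {

    /** Toy "multiplier" that splits its input text into one output per word. */
    static Iterator<String> processAndOutputNew(String input) {
        List<String> outputs = new ArrayList<>();
        for (String word : input.trim().split("\\s+")) {
            outputs.add(word);
        }
        return outputs.iterator();
    }

    /** Mirrors a plain process() call: the outputs are produced, then thrown away. */
    static void process(String input) {
        Iterator<String> outputs = processAndOutputNew(input);
        while (outputs.hasNext()) {
            outputs.next(); // drained and discarded; the caller never sees them
        }
    }

    /** What a multiplier-aware caller would do instead: collect the outputs. */
    static List<String> runCollecting(String input) {
        List<String> seen = new ArrayList<>();
        processAndOutputNew(input).forEachRemaining(seen::add);
        return seen;
    }
}
```

A caller that only ever invokes process() gets nothing back, which would match the symptom described above if the merged CAS is one of the newly produced ones.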

Original issue reported on code.google.com by torsten....@gmail.com on 5 Jul 2010 at 5:49

GoogleCodeExporter commented 9 years ago

I have not used the CAS multiplier.  I agree that maybe this is outside the 
scope of "simple" pipelines - but I think it makes sense to support this use 
case.  It looks like you might have some insight on how to fix this issue.  We 
would be happy to accept a patch if you have time to put one together.  
Otherwise, this issue will likely have to wait until after we release version 
1.0.0.

Original comment by pvogren@gmail.com on 8 Jul 2010 at 2:53

Changed state: Accepted

GoogleCodeExporter commented 9 years ago

I dug into the code today, but couldn't solve the issue.
I think I will need some help from the uima-user list with this.
Once I have a solution, I will provide a patch.

Original comment by torsten....@gmail.com on 8 Jul 2010 at 9:32

GoogleCodeExporter commented 9 years ago

Original comment by richard.eckart on 18 Mar 2011 at 4:06

Added labels: Type-Defect

GoogleCodeExporter commented 9 years ago

[copy-paste from the users mailing list]

I am not sure whether this fits your issue, but this is how I set up my 
aggregate AE:

AnalysisEngineDescription outerAggregate = createAggregateDescription(
    FlowControllerFactory.createFlowControllerDescription(
        FixedFlowController.class, FixedFlowController.PARAM_ACTION_AFTER_CAS_MULTIPLIER, "drop"),
    SOME_AEs (might include other aggregates));

outerAggregate.getAnalysisEngineMetaData().getOperationalProperties().setOutputsNewCASes(true);
runPipeline(readerItem, outerAggregate);

Note that I use the changed SimplePipeline that wraps everything in yet 
another aggregate. This way, the CAS is dropped inside the aggregate where all 
the processing happens. As long as you do not have simple AEs after that 
aggregate that drops the CAS, it does not matter that the CPE cannot drop it.
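
A rough way to picture the "drop" setting is the toy model below, in plain Java with no UIMA dependency. The class, the enum, and the string "CASes" are all hypothetical; the only thing it tries to show is the difference between letting the parent continue past the multiplier and dropping it there.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Toy model of a fixed flow with one multiplier step; not UIMA API.
final class DropAfterMultiplierSketch {

    enum ParentAction { DROP, CONTINUE }

    /** Hypothetical multiplier: one child per comma-separated segment. */
    static List<String> split(String parent) {
        List<String> children = new ArrayList<>();
        for (String part : parent.split(",")) {
            children.add(part.trim());
        }
        return children;
    }

    /**
     * Runs the multiplier, then the downstream step on every child.
     * With CONTINUE the parent also goes through the downstream step and is
     * returned last; with DROP it stops at the multiplier.
     */
    static List<String> run(String parent, UnaryOperator<String> downstream, ParentAction action) {
        List<String> results = new ArrayList<>();
        for (String child : split(parent)) {
            results.add(downstream.apply(child));
        }
        if (action == ParentAction.CONTINUE) {
            results.add(downstream.apply(parent));
        }
        return results;
    }
}
```

With DROP only the children reach the end of the flow, so nothing downstream ever has to deal with the original CAS.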

-Torsten

Original comment by richard.eckart on 18 Mar 2011 at 4:08

GoogleCodeExporter commented 9 years ago

I gather a solution would be to simply put all AEs into an AAE inside 
runPipeline instead of using JCasIterable. Any comments?

Original comment by richard.eckart on 18 Mar 2011 at 4:10

GoogleCodeExporter commented 9 years ago

That sounds like a plausible solution. We'd definitely need a test case to make 
sure we were fixing the problem though.

Either way, we should definitely get rid of the JCasIterable usage in 
runPipeline. JCasIterable is a convenience collection for allowing code to 
iterate over JCases, but it has to turn some real exceptions into 
RuntimeExceptions to do this. The runPipeline methods don't ever produce the 
JCases, so there's no reason for them to swallow the exceptions too.
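
For anyone wondering why an iterable has to swallow exceptions at all, here is a small self-contained sketch. The names (ExceptionWrappingSketch, ProcessingException, analyze) are invented for illustration and are not the uimaFIT classes; the point is only that Iterator.next() cannot declare checked exceptions, whereas a plain loop can let them propagate.

```java
import java.util.Iterator;
import java.util.List;

// Illustrative stand-ins; these are not the uimaFIT classes.
final class ExceptionWrappingSketch {

    /** Checked exception standing in for a processing failure. */
    static final class ProcessingException extends Exception {
        ProcessingException(String message) { super(message); }
    }

    /** A step that can fail with a checked exception. */
    static String analyze(String doc) throws ProcessingException {
        if (doc.isEmpty()) {
            throw new ProcessingException("empty document");
        }
        return doc.toUpperCase();
    }

    /** Iterator.next() cannot declare checked exceptions, so it has to wrap them. */
    static Iterator<String> asIterator(List<String> docs) {
        Iterator<String> source = docs.iterator();
        return new Iterator<String>() {
            @Override public boolean hasNext() { return source.hasNext(); }
            @Override public String next() {
                try {
                    return analyze(source.next());
                } catch (ProcessingException e) {
                    // The checked type is hidden behind an unchecked wrapper.
                    throw new IllegalStateException(e);
                }
            }
        };
    }

    /** A plain loop has no such constraint and can let the checked exception propagate. */
    static int runAll(List<String> docs) throws ProcessingException {
        int processed = 0;
        for (String doc : docs) {
            analyze(doc);
            processed++;
        }
        return processed;
    }
}
```

A runPipeline written like runAll keeps the original exception type in its signature, which is the behaviour argued for above.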

Original comment by steven.b...@gmail.com on 23 Mar 2011 at 10:53

GoogleCodeExporter commented 9 years ago

Looking into this in more detail, there are several problems.

1) Replacing JCasIterable with an aggregate analysis engine works only in the 
methods that take AnalysisEngineDescriptions. There is no way to wrap already 
instantiated AnalysisEngines in an aggregate.

2) runPipeline(jcas, ...) assumes that jcas is the input and probably also the 
output (?). That is, one would probably use those methods to run stuff 
without a CAS consumer. However, if a CAS multiplier is employed, the method 
would need to be changed to return a CasIterator.
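
The two points above can be sketched together in plain Java. This is only a toy model under my own assumptions: ManualChainSketch, tokenize, and upperCase are hypothetical, each "engine" is just a function from one input to a list of outputs, and nothing here is UIMA or uimaFIT API. It shows what chaining already-built engines by hand could look like, and why the result has to be an iterator once one of them multiplies its input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

// Toy model: each "engine" maps one input to zero or more outputs. Not UIMA API.
final class ManualChainSketch {

    /**
     * Feeds the input through already-built engines in order. Every output of
     * one engine becomes an input of the next, so the caller gets back an
     * iterator over results rather than the single object it passed in.
     */
    static Iterator<String> runPipeline(String input, List<Function<String, List<String>>> engines) {
        List<String> current = new ArrayList<>();
        current.add(input);
        for (Function<String, List<String>> engine : engines) {
            List<String> next = new ArrayList<>();
            for (String item : current) {
                next.addAll(engine.apply(item));
            }
            current = next;
        }
        return current.iterator();
    }

    /** Hypothetical multiplier: one output per whitespace-separated token. */
    static List<String> tokenize(String text) {
        return Arrays.asList(text.trim().split("\\s+"));
    }

    /** Hypothetical ordinary engine: exactly one output per input. */
    static List<String> upperCase(String text) {
        return Arrays.asList(text.toUpperCase());
    }
}
```

Without a multiplier in the list the iterator holds exactly one element, so the existing single-CAS behaviour would be a special case of the iterator-returning variant.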

While looking into this, I noticed that UIMA seems to provide a way to run a 
CollectionReader as part of an aggregate (see CollectionReaderAdapter). 
Probably we could exploit that to further delegate the pipeline logic to UIMA.

Original comment by richard.eckart on 26 Mar 2011 at 3:55

Added labels: Priority-Low

GoogleCodeExporter commented 9 years ago

These issues are candidates for version 1.3.0.

Original comment by richard.eckart on 7 May 2011 at 5:31

Added labels: Milestone-1.3.0

GoogleCodeExporter commented 9 years ago

Original comment by richard.eckart on 4 Jan 2012 at 10:51

Added labels: Milestone-1.4.0

GoogleCodeExporter commented 9 years ago

Original comment by richard.eckart on 5 Jul 2012 at 4:02

Added labels: Milestone-1.5.0
Removed labels: Milestone-1.4.0

GoogleCodeExporter commented 9 years ago

Original comment by richard.eckart on 7 Jan 2013 at 4:51

Added labels: ASFJira-No

GoogleCodeExporter commented 9 years ago

Original comment by richard.eckart on 25 Aug 2013 at 8:17

Removed labels: Milestone-1.5.0

GoogleCodeExporter commented 9 years ago

Original comment by richard.eckart on 25 Aug 2013 at 8:18
