apache / uima-uimacpp

C++ support for Apache UIMA
https://uima.apache.org/
Apache License 2.0

Support Aggregate Engines #6

Open DrDub opened 9 months ago

DrDub commented 9 months ago

Is your feature request related to a problem? Please describe.

The UIMA framework has primitive engines (annotators) and aggregate engines, which are composed of primitive engines or other aggregate engines. At present, the C++ version of the framework can handle only primitive engines, not aggregates.

An example of a primitive annotator descriptor is the SimpleTextSegmenter.xml. It refers to the annotator itself, SimpleTextSegmenter.cpp.

The aggregate descriptors are discussed in the Apache UIMA Reference. An example descriptor from the Java framework is the NamesAndGovernmentOfficials_TAE.xml.

The Java UIMA aggregate analysis engine implementation involves the class AggregateAnalysisEngine_impl.java and many others.

Describe the solution you'd like

The UIMACPP framework should be able to load and execute aggregate engines, described in XML, that are composed of other aggregate engines or of primitive engines implemented in C++.

This includes parsing the XML descriptors and routing the annotations (held in the Common Analysis Structure, or CAS) between the different annotators. Note that aggregates shield annotators based on the input and output annotations declared in their descriptors.
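A minimal sketch of the routing idea, under loudly stated assumptions: `ToyCas`, `ToyDelegate` and `runFixedFlow` are hypothetical names made up for illustration and are not part of the UIMACPP API. It also takes one simplified reading of "shielding": a delegate runs only if every input type it declares is already in the CAS, and it contributes only the output types it declares.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a CAS: only the names of the annotation
// types currently present (not the real UIMACPP CAS class).
struct ToyCas {
    std::set<std::string> types;
};

// Hypothetical delegate: a name plus the input and output annotation
// types declared in its descriptor.
struct ToyDelegate {
    std::string name;
    std::set<std::string> inputs;
    std::set<std::string> outputs;
};

// Visits the delegates in fixed-flow order. A delegate is skipped when
// any of its declared inputs is missing from the CAS; otherwise its
// declared outputs are added to the CAS. Returns the names of the
// delegates that actually ran.
std::vector<std::string> runFixedFlow(const std::vector<ToyDelegate>& flow,
                                      ToyCas& cas) {
    std::vector<std::string> executed;
    for (const ToyDelegate& d : flow) {
        assert(!d.name.empty());  // every delegate needs a key in the flow
        const bool ready = std::all_of(
            d.inputs.begin(), d.inputs.end(),
            [&cas](const std::string& t) { return cas.types.count(t) > 0; });
        if (!ready) {
            continue;  // declared inputs not satisfied: skip this delegate
        }
        cas.types.insert(d.outputs.begin(), d.outputs.end());
        executed.push_back(d.name);
    }
    return executed;
}
```

In this toy model, a flow of a tokenizer (Document to Token), a name finder that needs both Token and Sentence, and a tagger (Token to PosTag) would run the tokenizer and the tagger and skip the name finder, because nothing in the flow produces Sentence. A real implementation would read the flow and the capabilities from the aggregate's XML descriptor.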

Describe alternatives you've considered

With UIMA-AS it was possible to interoperate between Java and C++, but the UIMA-AS framework has been retired.

Additional context

Users have discussed this as one of the main roadblocks to using the C++ version of the framework: https://lists.apache.org/thread/f1r3sghgn2oqhvzz27y26zg6j3olv8qq

ShaiviAgarwal2 commented 7 months ago

@DrDub Hi, I would like to work on this issue!

DrDub commented 7 months ago

Hi @ShaiviAgarwal2, do you mind checking whether the instructions in the new readme at #15 work? I'd really appreciate it.

Also, if you're considering applying to the GSoC please send me an email to drdub@apache.org to further discuss.

ShaiviAgarwal2 commented 7 months ago

I sent you the email. Could you please check it?

ShaiviAgarwal2 commented 7 months ago

@DrDub I checked the instructions in the new readme at #15. They work fine :)

mac-op commented 5 months ago

Hi, I'll be working on this issue

mac-op commented 4 months ago

We have identified what has already been implemented (in internal_aggregate_engine.cpp, annotator_mgr.cpp and related files) as well as what is currently missing (e.g., when a delegate is a CAS Multiplier, the CASIter for Aggregates does not return the children CASes).
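To make the missing CAS Multiplier behavior concrete, here is a rough sketch of what "returning the children CASes" could look like. Everything in it is an assumption for illustration: `ToyCas`, `ToyDelegate`, `routeThrough`, `processAndCollectChildren` and the two sample delegates are hypothetical and do not reflect the actual code in internal_aggregate_engine.cpp. The sketch also assumes that each child continues through the delegates that follow the multiplier before the aggregate hands it back.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a CAS: just the document text.
struct ToyCas {
    std::string text;
};

// Hypothetical delegate: processes a CAS in place and returns any child
// CASes it created. An ordinary annotator returns an empty vector.
using ToyDelegate = std::function<std::vector<ToyCas>(ToyCas&)>;

// Sends `cas` through flow[start..]. Each child emitted by a delegate is
// routed through the delegates after that one and then appended to `out`.
void routeThrough(const std::vector<ToyDelegate>& flow, std::size_t start,
                  ToyCas& cas, std::vector<ToyCas>& out) {
    assert(start <= flow.size());
    for (std::size_t i = start; i < flow.size(); ++i) {
        std::vector<ToyCas> children = flow[i](cas);
        for (ToyCas& child : children) {
            routeThrough(flow, i + 1, child, out);
            out.push_back(child);
        }
    }
}

// What an aggregate-level iterator would hand back: every child CAS
// produced by any delegate, after it finished the rest of the flow.
std::vector<ToyCas> processAndCollectChildren(
        const std::vector<ToyDelegate>& flow, ToyCas& input) {
    std::vector<ToyCas> out;
    routeThrough(flow, 0, input, out);
    return out;
}

// Sample multiplier: one child CAS per whitespace-separated word.
std::vector<ToyCas> splitOnSpaces(ToyCas& cas) {
    std::vector<ToyCas> children;
    std::istringstream words(cas.text);
    std::string word;
    while (words >> word) {
        children.push_back(ToyCas{word});
    }
    return children;
}

// Sample ordinary annotator: edits the CAS, creates no children.
std::vector<ToyCas> appendBang(ToyCas& cas) {
    cas.text += "!";
    return {};
}
```

With a flow of `splitOnSpaces` followed by `appendBang`, the children of the input reach `appendBang` before being returned. What should happen to the parent CAS after a multiplier runs is a separate design decision that this sketch does not settle; here the parent simply continues through the flow.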

I will create test cases to check whether the XML parser conforms to the entire spec, and then move on to the aggregate engine functionality.