laito / cleartk

Automatically exported from code.google.com/p/cleartk

Provide example of how to wrap a ClearTK AE into a simple API that any java program can call #405

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
We should provide an example of how to create a ClearTK wrapper that lets 
any Java program call a ClearTK-based analysis engine. NER would be a good 
candidate: a Java program simply wants to call an NER routine with some text 
and get back some results. Here's a sketch of what such an example might 
look like:

NER myNer = new NER();
List<MyNerPojo> nerResults = myNer.doNer(myText);

Where doNer() has a signature something like this:

public List<MyNerPojo> doNer(String myText)

The implementation of doNer() would look something like this:

1: AnalysisEngine nerAE = builder.createAggregate();
2: JCas myJcas = nerAE.newJCas();
3: myJcas.setDocumentText(myText);
4: nerAE.process(myJcas);
5: for (NamedEntityMention mention : JCasUtil.select(myJcas, NamedEntityMention.class)) {
6:     // collect whatever you want from each mention into your POJO return type
7: }

Line 1 makes use of an AggregateBuilder. See RunNamedEntityChunker.main for a 
nice example of this.
Line 2 is an expensive call (~100 ms), so you may want to pass the JCas 
instance in to doNer() or keep it as a member variable. You can clear a JCas 
instance for reuse with the reset() method. Note that sharing one JCas across 
calls has synchronization implications, i.e. doNer() should be marked as 
synchronized.
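
The reuse-and-synchronize advice for line 2 could be sketched roughly as follows. To keep the sketch runnable without UIMA or ClearTK on the classpath, FakeJCas and FakeNerEngine are hypothetical stand-ins for JCas and the aggregate AnalysisEngine (they only mimic the newJCas(), reset(), setDocumentText() and process() calls used above), NerWrapper is an illustrative name, and the "capitalized token" rule is a toy placeholder for real NER. A real wrapper would hold the actual UIMA objects and return MyNerPojo instances instead of strings.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for JCas: a reusable, resettable document container. */
class FakeJCas {
    private String text;
    private final List<String> mentions = new ArrayList<>();

    void reset() { text = null; mentions.clear(); }
    void setDocumentText(String text) { this.text = text; }
    String getDocumentText() { return text; }
    List<String> getMentions() { return mentions; }
}

/** Hypothetical stand-in for the aggregate analysis engine. */
class FakeNerEngine {
    int jcasCreated = 0; // counts how often the "expensive" call happens

    FakeJCas newJCas() { jcasCreated++; return new FakeJCas(); }

    /** Toy placeholder for NER: every capitalized token becomes a mention. */
    void process(FakeJCas jcas) {
        for (String token : jcas.getDocumentText().split("\\s+")) {
            if (!token.isEmpty() && Character.isUpperCase(token.charAt(0))) {
                jcas.getMentions().add(token);
            }
        }
    }
}

/** Wrapper that creates the document container once and reuses it per call. */
class NerWrapper {
    private final FakeNerEngine engine;
    private final FakeJCas jcas;

    NerWrapper(FakeNerEngine engine) {
        this.engine = engine;
        this.jcas = engine.newJCas(); // pay the creation cost only once
    }

    /** synchronized because every call mutates the single shared container. */
    synchronized List<String> doNer(String myText) {
        jcas.reset();                 // clear state left over from the previous call
        jcas.setDocumentText(myText);
        engine.process(jcas);
        // Copy the results so callers never see the container's internal list.
        return new ArrayList<>(jcas.getMentions());
    }
}
```

Copying the results before returning matters here: without it, the next reset() would empty a list that an earlier caller is still holding.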

Some of the above text was copied from a recent post on the cleartk-users list. 

Original issue reported on code.google.com by phi...@ogren.info on 22 Jun 2014 at 8:02