sudeep87 / uimafit

Automatically exported from code.google.com/p/uimafit

Simplify TokenFactory methods #4

Closed: GoogleCodeExporter closed this issue 8 years ago

GoogleCodeExporter commented 8 years ago
The TokenFactory methods take far too many parameters.  Many of the
ClearTK tests provide part-of-speech tags but no stems, so we should add
a method for that case.

Also, if you specify a part-of-speech feature but do not specify
part-of-speech tags, the code does not work correctly.

Original issue reported on code.google.com by pvogren@gmail.com on 1 May 2010 at 2:54

GoogleCodeExporter commented 8 years ago

Original comment by pvogren@gmail.com on 1 May 2010 at 5:30

GoogleCodeExporter commented 8 years ago
OK. I renamed TokenFactory to TokenBuilder and greatly simplified the
interface.  You now instantiate a TokenBuilder with the token class and
sentence class and, optionally, the feature names for part-of-speech tags and
stems.  You then call buildTokens and pass in a JCas, some text, and
optionally the token text, the POS tags, and the stems.
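
For readers without the code at hand, here is a minimal sketch of the calling pattern described above: one full method plus shorter overloads, so a test passes only what it has. It is not the uimaFIT implementation. The real TokenBuilder works on a JCas and annotation classes, while this stand-in uses plain Java objects. The names TokenBuilderSketch and Token, and the space-separated input format, are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the builder described above; it does not use UIMA.
final class TokenBuilderSketch {

    /** A token with character offsets into the text; pos and stem may be null. */
    static final class Token {
        final int begin;
        final int end;
        final String text;
        final String pos;
        final String stem;

        Token(int begin, int end, String text, String pos, String stem) {
            this.begin = begin;
            this.end = end;
            this.text = text;
            this.pos = pos;
            this.stem = stem;
        }
    }

    /** Text only: tokens are the whitespace-separated pieces of the text. */
    static List<Token> buildTokens(String text) {
        return buildTokens(text, text, null, null);
    }

    /** Tags but no stems, the case the original report asks for. */
    static List<Token> buildTokens(String text, String tokens, String posTags) {
        return buildTokens(text, tokens, posTags, null);
    }

    /** Full form: posTags and stems are optional and may be null. */
    static List<Token> buildTokens(String text, String tokens, String posTags, String stems) {
        List<Token> result = new ArrayList<>();
        if (tokens.trim().isEmpty()) {
            return result;
        }
        String[] words = tokens.trim().split("\\s+");
        String[] tags = splitOrNull(posTags, words.length, "POS tags");
        String[] stemParts = splitOrNull(stems, words.length, "stems");

        int offset = 0;
        for (int i = 0; i < words.length; i++) {
            // Locate each token in the text, searching from the end of the previous one.
            int begin = text.indexOf(words[i], offset);
            if (begin < 0) {
                throw new IllegalArgumentException("token not found in text: " + words[i]);
            }
            int end = begin + words[i].length();
            // Missing tags or stems simply leave the field null instead of failing.
            result.add(new Token(begin, end, words[i],
                    tags == null ? null : tags[i],
                    stemParts == null ? null : stemParts[i]));
            offset = end;
        }
        return result;
    }

    private static String[] splitOrNull(String values, int expected, String label) {
        if (values == null) {
            return null;
        }
        String[] parts = values.trim().split("\\s+");
        if (parts.length != expected) {
            throw new IllegalArgumentException(
                    label + ": expected " + expected + " values but got " + parts.length);
        }
        return parts;
    }

    public static void main(String[] args) {
        for (Token t : buildTokens("The dog barks.", "The dog barks .", "DT NN VBZ .")) {
            System.out.println(t.begin + "-" + t.end + " " + t.text + " " + t.pos + " " + t.stem);
        }
    }
}
```

In this sketch, omitting the tags or stems leaves the corresponding field null rather than raising an error, which is one way to avoid the failure reported when a part-of-speech feature is configured but no tags are supplied.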

An instance of TokenBuilder is provided for free if your test inherits from the 
newly added Test_ImplBase (see issue #21).

Anyway, I think the resulting unit tests, which now use TokenBuilder, are much
cleaner, and I am no longer embarrassed by this interface.

Original comment by pvogren@gmail.com on 11 Jun 2010 at 4:07