laito / cleartk

Automatically exported from code.google.com/p/cleartk
0 stars 0 forks source link

convert CharacterCategoryPatternExtractor to a feature function #390

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
It seems like CharacterCategoryPatternExtractor ought to be a FeatureFunction.  
Is there any reason why it isn't?  The argument for making it a FeatureFunction 
is so that feature extraction isn't repeated and so that we have a consistent 
pattern for feature extraction.  Thoughts?

[Steve]
Agreed that it should be a feature function.

Original issue reported on code.google.com by phi...@ogren.info on 27 Oct 2013 at 5:12

GoogleCodeExporter commented 9 years ago
It occurs to me as I start making this change that it could easily be both a 
feature function and a feature extractor.  Feature functions are nice if you 
are going to use FeatureFunctionExtractor along with a bunch of feature 
functions.  However, it actually makes the timeml code a little more 
complicated because instead of adding a simple 
CharacterCategoryPatternExtractor to the list of extractors you have to create 
a FeatureFunctionExtractor and initialize it with a CoveredTextExtractor and 
the CharacterCategoryPatternFunction.  It doesn't look that pretty really.  So, 
I think it can be both.  The question is what to call it (extractor or 
function)?  

I suppose another possibility is to have a CharacterCategoryPatterUtility class 
(or some such) and have both a CharacterCategoryPatternExtractor  and 
CharacterCategoryPatternFunction.

Thoughts?

Original comment by phi...@ogren.info on 29 Oct 2013 at 4:04

GoogleCodeExporter commented 9 years ago
Yeah, I would have both a CharacterCategoryPatternExtractor and a 
CharacterCategoryPatternFunction and just have them share code somehow, either 
a common superclass or a utility class as you suggested.

Original comment by steven.b...@gmail.com on 29 Oct 2013 at 5:04

GoogleCodeExporter commented 9 years ago
I guess one more alternative would be to have a static method 
CharacterCategoryPatternFunction.asExtractor() that just wraps 
CharacterCategoryPatternFunction in a FeatureFunctionExtractor. This might have 
the benefit that we could supply the same .asExtractor() static method on the 
other FeatureFunctions, making them easier to use without manually creating a 
FeatureFunctionExtractor.

Original comment by steven.b...@gmail.com on 29 Oct 2013 at 5:11

GoogleCodeExporter commented 9 years ago
I had exactly the same thought this morning before I saw this.  

I'm not familiar with the as* naming convention.  Did you just make that up?  

Original comment by phi...@ogren.info on 30 Oct 2013 at 5:19

GoogleCodeExporter commented 9 years ago
I've seen it before, but not sure where. Might be a scala-ism. I'd be fine with 
getExtractor() or createExtractor() or something like that too.

Original comment by steven.b...@gmail.com on 30 Oct 2013 at 1:41

GoogleCodeExporter commented 9 years ago
This issue was closed by revision 517d7cdda9f4.

Original comment by phi...@ogren.info on 31 Oct 2013 at 4:44

GoogleCodeExporter commented 9 years ago
I added two createExtractor methods one that uses a default PatternType and 
another that takes one as a parameter.  For backwards compatibility the 
returned feature extractor is of type NamedFeatureExtractor1 (since that is 
what this class implemented when it was a feature extractor.)  Tests are all 
passing and code is checked in.  

Original comment by phi...@ogren.info on 31 Oct 2013 at 4:45