unit test trainable feature extractors

mkabbasi / cleartk

Automatically exported from code.google.com/p/cleartk

unit test trainable feature extractors #378

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago

The trainable feature extractors should be unit tested.  I plan to do this
partly to prepare for writing up the feature extraction documentation
required by issue #3.

Original issue reported on code.google.com by phi...@ogren.info on 23 Jul 2013 at 5:33

GoogleCodeExporter commented 8 years ago

Original comment by phi...@ogren.info on 9 Dec 2013 at 4:21

Changed state: Started
Added labels: Component-ml

GoogleCodeExporter commented 8 years ago

I added TfidfExtractorTest.  This should serve as a template for testing the 
other trainable extractors.  It should go pretty quickly.

Original comment by phi...@ogren.info on 9 Dec 2013 at 5:02

GoogleCodeExporter commented 8 years ago

I added MinMaxNormalizationExtractorTest, which covers some very basic edge
cases, none of which were handled at all.  So, I have made my best guess at
what the behavior should be:
- if the feature has never been seen, don't throw an NPE; just return 0.5
- if min == max, return 0.5 if value == min; otherwise return 0 (value below min) or 1 (value above max)
- if the value is a new minimum, return 0 (instead of a negative number)
- if the value is a new maximum, return 1 (instead of something larger than 1)
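
The four rules above can be sketched in plain Java. This is only an illustration under stated assumptions: `MinMaxSketch`, `observe`, and `normalize` are hypothetical names, not the ClearTK API, and the real MinMaxNormalizationExtractor may store its statistics differently.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for illustration; not the ClearTK MinMaxNormalizationExtractor. */
class MinMaxSketch {
    // Observed {min, max} per feature name, collected during a training pass.
    private final Map<String, double[]> stats = new HashMap<>();

    /** Record one training value for the named feature. */
    void observe(String name, double value) {
        double[] minMax = stats.get(name);
        if (minMax == null) {
            stats.put(name, new double[] {value, value});
        } else {
            minMax[0] = Math.min(minMax[0], value);
            minMax[1] = Math.max(minMax[1], value);
        }
    }

    /** Scale a value into [0, 1], applying the edge-case rules from the comment. */
    double normalize(String name, double value) {
        double[] minMax = stats.get(name);
        if (minMax == null) {
            return 0.5; // feature never seen: return the midpoint instead of throwing an NPE
        }
        double min = minMax[0];
        double max = minMax[1];
        if (min == max) {
            // Degenerate range: only one distinct value was seen in training.
            if (value == min) {
                return 0.5;
            }
            return value < min ? 0.0 : 1.0;
        }
        if (value < min) {
            return 0.0; // new minimum: clamp instead of going negative
        }
        if (value > max) {
            return 1.0; // new maximum: clamp instead of exceeding 1
        }
        return (value - min) / (max - min);
    }
}
```

Clamping keeps every output inside [0, 1], so a downstream classifier never sees values outside the range it was trained on.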

Original comment by phi...@ogren.info on 12 Apr 2014 at 11:20

GoogleCodeExporter commented 8 years ago

Ok - I added a ZeroMeanUnitStddevExtractorTest.  Please see discussion of fixes 
to the corresponding ZeroMeanUnitStddevExtractor in Issue #399.

Original comment by phi...@ogren.info on 13 Apr 2014 at 3:09

GoogleCodeExporter commented 8 years ago

I added a basic unit test for CosineSimilarity, which had a bug in it: the
magnitude was not summing the squares of the values correctly.  Fixed it
with a plus sign!  Getting closer....
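
For reference, here is a minimal sketch of a magnitude computation that does accumulate the squares. The thread does not show the faulty line, so the `=` versus `+=` contrast in the comment is only a guess at the kind of bug a single plus sign would fix. `CosineSketch` and its methods are hypothetical names, not the ClearTK CosineSimilarity class.

```java
/** Hypothetical illustration; not the ClearTK CosineSimilarity implementation. */
class CosineSketch {
    /** Euclidean length of a dense vector: square root of the sum of squares. */
    static double magnitude(double[] v) {
        double sumOfSquares = 0.0;
        for (double x : v) {
            // '+=' accumulates every square; a plain '=' here would keep only
            // the last one (assumed form of the bug, not confirmed by the thread).
            sumOfSquares += x * x;
        }
        return Math.sqrt(sumOfSquares);
    }

    /** Cosine of the angle between two equal-length vectors; 0 if either is all zeros. */
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double dot = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
        }
        double denominator = magnitude(a) * magnitude(b);
        return denominator == 0.0 ? 0.0 : dot / denominator;
    }
}
```

A test on a vector with more than one non-zero component, such as (3, 4) with expected magnitude 5, is enough to catch an accumulation bug of this kind.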

Original comment by phi...@ogren.info on 13 Apr 2014 at 4:40

GoogleCodeExporter commented 8 years ago

I added a basic unit test for CentroidTfidfSimilarityExtractor and found a 
problem with the computeCentroid method which I fixed.  There's now a basic 
unit test for each of the trainable extractors and so I'm going to close this 
issue.  The tests could be more complete esp. with respect to edge cases - but 
that's all I can do for now.

Original comment by phi...@ogren.info on 26 Apr 2014 at 6:21

Changed state: Fixed