ClearTK / cleartk

Machine learning components for Apache UIMA
http://cleartk.github.io/cleartk/

MutualInformationTest.testMutualInformationFeatureSelection() fails on Java 1.8.0 #415

Closed bethard closed 8 years ago

bethard commented 9 years ago

Original issue 417 created by ClearTK on 2015-03-24T17:59:52.000Z:

What steps will reproduce the problem? Update to Java 1.8.0, then run MutualInformationTest.testMutualInformationFeatureSelection().

What is the expected output? Test passes

What do you see instead?

What version of the product are you using? On what operating system? ClearTK 2.0.1, java version "1.8.0_40", OS X Yosemite

Comment: In the above test, the order of 'Bag_Covered:pig' and 'Bag_Covered:wolf' differs between the two versions of Java. The mutual information of the two features is the same, so nothing but the iteration order decides which one comes first.

bethard commented 9 years ago

Comment #1 originally posted by ClearTK on 2015-03-24T18:26:26.000Z:

Fixes issue 417.

The problem is caused by the HashBasedTable used in MutualInformationStats, which makes the order of featureNames in MutualInformationFeatureSelectionExtractor.train() arbitrary. To fix this, I wrapped featureNames in a TreeSet (new TreeSet(featureNames)) so that the ordering is always the same.
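The effect described above can be reproduced in isolation. The following is a minimal sketch, not the ClearTK code: the class TieOrderingSketch, the helper rank(), the score values, and the extra feature 'Bag_Covered:straw' are all made up for illustration, and two LinkedHashSets with different insertion orders stand in for a hash-based collection that iterates differently on two Java versions. It only uses java.util, so it does not depend on Guava.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

public class TieOrderingSketch {

  // Hypothetical helper: ranks feature names by descending score.
  // List.sort is stable, so names with equal scores keep the order
  // in which the input collection iterated them.
  static List<String> rank(Collection<String> featureNames, Map<String, Double> scores) {
    List<String> ranked = new ArrayList<>(featureNames);
    ranked.sort((a, b) -> Double.compare(scores.get(b), scores.get(a)));
    return ranked;
  }

  // Made-up scores; 'pig' and 'wolf' are deliberately tied.
  static Map<String, Double> exampleScores() {
    Map<String, Double> scores = new HashMap<>();
    scores.put("Bag_Covered:pig", 0.5);
    scores.put("Bag_Covered:wolf", 0.5);
    scores.put("Bag_Covered:straw", 0.9);
    return scores;
  }

  public static void main(String[] args) {
    Map<String, Double> scores = exampleScores();

    // Two iteration orders, standing in for two JVMs hashing differently.
    Collection<String> orderA = new LinkedHashSet<>(
        Arrays.asList("Bag_Covered:pig", "Bag_Covered:wolf", "Bag_Covered:straw"));
    Collection<String> orderB = new LinkedHashSet<>(
        Arrays.asList("Bag_Covered:wolf", "Bag_Covered:pig", "Bag_Covered:straw"));

    // The tied features come out in different orders.
    System.out.println(rank(orderA, scores));
    System.out.println(rank(orderB, scores));

    // Copying into a TreeSet first fixes the iteration order (natural
    // String ordering), so the ranking no longer depends on the input.
    System.out.println(rank(new TreeSet<>(orderA), scores));
    System.out.println(rank(new TreeSet<>(orderB), scores));
  }
}
```

The TreeSet does not change any score. It only makes the tie-break reproducible, which is what a test asserting an exact feature order needs.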

leebecker commented 8 years ago

Closing this issue. Fix uses a TreeBasedTable instead of a HashBasedTable.