dnmilne / wikipediaminer

An open source toolkit for mining Wikipedia

The Label compare model for English is broken #22

Open hoangyenan opened 10 years ago

hoangyenan commented 10 years ago

The model "labelCompare_en_In.model" is not working. When I try to create a new LabelComparer(wikipedia, artComparer), it always throws this error:

Exception in thread "main" java.lang.ClassNotFoundException: org.dmilne.weka.wrapper.TypedAttribute
    at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:340)
    at java.io.ObjectInputStream.resolveClass(ObjectInputStream.java:626)
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
    at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1707)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1345)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
    at weka.wrapper.Decider.load(Decider.java:169)
    at org.wikipedia.miner.comparison.LabelComparer.loadComparisonClassifier(Unknown Source)
    at org.wikipedia.miner.comparison.LabelComparer.<init>(Unknown Source)

When I replace the English model with the Spanish model, my code runs without error. Could you please upload a working English model, or give instructions for training a new model for the English label comparer?

hoangyenan commented 10 years ago

I found a working model at this link: https://github.com/chauff/wm/blob/master/models/compare/labelCompare_en_In.model

Judging by its upload date, the model is relatively recent, but I'm not sure about its contents. Hope it helps. An

Neuw84 commented 10 years ago

If you use the GitHub version (the mavenized one), you need to retrain the models because the Weka libraries have been updated. If you use the older versions, the existing models will work. (I do not know exactly what changed.)
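
To check whether a given .model file matches the libraries on the classpath, it can help to list the class names recorded in the file. In the trace above the missing class is org.dmilne.weka.wrapper.TypedAttribute while the loading code is weka.wrapper.Decider, which suggests (but does not prove) that the English model was serialized against a differently packaged version of the wrapper. The following is a minimal sketch using only the JDK. It assumes the model is a plain Java-serialized object, as the ObjectInputStream frames in the trace indicate. SerializedClassLister, RecordingStream and listClasses are hypothetical names for illustration and are not part of Wikipedia Miner or Weka.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectStreamClass;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper, not part of Wikipedia Miner: lists the class names
// that a Java-serialized file refers to.
public class SerializedClassLister {

    /** ObjectInputStream that records every class descriptor it reads. */
    static class RecordingStream extends ObjectInputStream {
        final Set<String> seen = new LinkedHashSet<>();

        RecordingStream(InputStream in) throws IOException {
            super(in);
        }

        @Override
        protected ObjectStreamClass readClassDescriptor()
                throws IOException, ClassNotFoundException {
            // The descriptor is read from the stream before the class is
            // resolved, so the name is recorded even if the class is missing.
            ObjectStreamClass desc = super.readClassDescriptor();
            seen.add(desc.getName());
            return desc;
        }
    }

    /** Returns the class names encountered while reading one object. */
    static Set<String> listClasses(InputStream in) throws IOException {
        RecordingStream stream = new RecordingStream(in);
        try {
            stream.readObject();
        } catch (ClassNotFoundException e) {
            // Expected when the file refers to a class absent from the classpath.
            System.err.println("Missing class: " + e.getMessage());
        }
        return stream.seen;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: SerializedClassLister <path-to-model-file>");
            return;
        }
        try (InputStream in = new FileInputStream(args[0])) {
            for (String name : listClasses(in)) {
                System.out.println(name);
            }
        }
    }
}
```

Running this against both the English and the Spanish model and comparing the package names in the output should show whether the two files were built against the same wrapper classes. It only diagnoses the mismatch; it does not make an incompatible model loadable.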