
The MultiClassLIBSVM* machinery is not working #103

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
I set up an experiment and tried using the MultiClassLIBSVM* classes
for the learner/classifier.  It didn't work at all, so I started digging
in to figure out why.  I began by writing some unit tests that exhibit
the behavior.  I have created two failing unit tests that I believe
should not fail:

org.cleartk.example.pos.ExamplePosClassifierTest.testLibsvm()
org.cleartk.example.pos.NonSequentialExamplePOSAnnotationHandlerTest.testLibsvm()

The first test uses the LIBSVM library as a sequential classifier via the
ViterbiClassifier/DataWriter.  To eliminate the Viterbi wrapper as the
cause, I created the second test, which uses the LIBSVM classifier directly
as a non-sequential classifier.  For this I rewrote the annotation handler,
which is in the same directory.  You will see that both of these tests run
the exact same test on many different learners and that all the others
pass.  It is also instructive to compare the output training data files
from these tests for SVMLight and LIBSVM (remove the @After annotation to
preserve the training data and results files), because they contain
identical data.
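(For reference, SVMLight and LIBSVM share the same sparse training-data
format, one instance per line as a label followed by index:value feature
pairs, which is why the two writers can produce identical files.  The
lines below are invented purely for illustration:)

    3 1:1.0 47:1.0 203:1.0
    1 2:1.0 18:1.0 203:1.0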

What is also puzzling is that the tests in
org.cleartk.classifier.libsvm.RunLIBSVMTests pass.  Looking at the test
testMultiClassLIBSVM, I wondered if it was passing because it bypasses
the ClassifierAnnotator and DataWriterAnnotator.  So, I created another
test that is essentially the same except that the two UIMA components are
used.  This also works.

So, I don't know what is going wrong.  I see the second test mentioned in
the last paragraph and the NonSequentialExamplePOSAnnotationHandlerTest as
essentially the same and can't think of why one works and the other doesn't.  

Original issue reported on code.google.com by pvogren@gmail.com on 31 Jul 2009 at 11:20

GoogleCodeExporter commented 8 years ago
I meant to mention that for the LREC ClearTK paper I successfully used
LIBSVM for the exact same task - part-of-speech tagging.  Of course, things
were much different back then - but I don't think the answer is that the
data is fundamentally incompatible with LIBSVM.

Original comment by pvogren@gmail.com on 31 Jul 2009 at 11:24

GoogleCodeExporter commented 8 years ago
Ok, I've started looking at this. I can confirm that the test, as it
stands now, fails for me, too.

Looking at the svm-train usage instructions (what you get when you call it
without arguments), the default kernel is RBF. I changed the test to
explicitly use a linear kernel (arguments "-t 0"), and it's working now.
Can you confirm this?
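For anyone reproducing this outside the ClearTK test harness, here is a
minimal sketch of what "-t 0" corresponds to in LIBSVM's own Java API
(svm-train's -t option sets the kernel type, with 0 meaning linear). The
toy instances and parameter values below are made up for illustration:

    import libsvm.*;

    public class LinearKernelSketch {
        public static void main(String[] args) {
            // Two toy instances in LIBSVM's sparse representation
            // (index/value pairs); real NLP data would have many
            // thousands of binary features.
            svm_problem prob = new svm_problem();
            prob.l = 2;
            prob.y = new double[] { 1.0, 2.0 };
            prob.x = new svm_node[][] {
                { node(1, 1.0), node(3, 1.0) },
                { node(2, 1.0), node(3, 1.0) }
            };

            svm_parameter param = new svm_parameter();
            param.svm_type = svm_parameter.C_SVC;
            param.kernel_type = svm_parameter.LINEAR; // the "-t 0" fix
            param.C = 1.0;
            param.cache_size = 100;
            param.eps = 0.001;

            // svm_check_parameter returns null when the settings are valid.
            String error = svm.svm_check_parameter(prob, param);
            if (error != null) {
                throw new IllegalArgumentException(error);
            }
            svm_model model = svm.svm_train(prob, param);
            System.out.println(svm.svm_predict(model, prob.x[0]));
        }

        private static svm_node node(int index, double value) {
            svm_node n = new svm_node();
            n.index = index;
            n.value = value;
            return n;
        }
    }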

As to why it fails with the default settings, I'm not sure. But maybe it's
not too surprising: AFAIK RBF kernels are not commonly used with the
typical NLP feature sets (i.e., a large number of binary features). It
would certainly be nice to have a good example data set that's easily and
predictably separable with an RBF kernel, but I have to admit that I've
never had a good intuition for what kinds of decision surfaces an RBF
kernel is able to produce.

Original comment by phwetz...@gmail.com on 1 Aug 2009 at 12:22

GoogleCodeExporter commented 8 years ago
Everything is separable with an RBF kernel, as long as you set gamma high
enough. You can play around with the demo here:

http://www.csie.ntu.edu.tw/~cjlin/libsvm/#GUI

Basically, with a high enough gamma, the RBF kernel will draw a circle
around each data point. Of course, the fact that everything is separable
doesn't mean that it's a good separation. ;-)
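(For reference, the RBF kernel computes K(x, x') = exp(-gamma * ||x - x'||^2),
so as gamma grows, each training point's region of influence shrinks toward
a tiny bump around itself; that is the circle-drawing behavior above, and it
is also why "separable" and "generalizes well" are very different claims.)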

As far as the ClearTK tests go, though, I think it makes sense to
explicitly use a linear kernel instead of an RBF one. My experience has
generally been that linear and polynomial kernels work better for NLP
tasks.

Original comment by steven.b...@gmail.com on 1 Aug 2009 at 5:01

GoogleCodeExporter commented 8 years ago
ok - I have changed the training args to "-t 0" per Philipp's suggestion
for the unit tests in question.  I will label this as WontFix since it was
the tests that were wrong - not the framework code.
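For anyone rerunning the training outside the test harness, the same fix
on LIBSVM's command line is just selecting the linear kernel explicitly
(the file names here are placeholders):

    svm-train -t 0 training-data.libsvm pos-tagger.model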

Original comment by pvogren@gmail.com on 13 Aug 2009 at 4:06