Closed GoogleCodeExporter closed 8 years ago
I meant to mention that for the LREC ClearTK paper I successfully used LIBSVM for the exact same task: part-of-speech tagging. Of course, things were much different back then, but I don't think the answer is that the data is fundamentally incompatible with LIBSVM.
Original comment by pvogren@gmail.com
on 31 Jul 2009 at 11:24
OK, I've started looking at this. I can confirm that the test, as it stands now, fails for me, too.
Looking at the svm-train usage instructions (what you get when you call it without arguments), the default kernel is RBF. I changed the test to explicitly use a linear kernel (arguments "-t 0"), and it's working now. Can you confirm this?
As to why it fails with the default settings, I'm not sure. But maybe it's not too surprising: AFAIK, RBF kernels are not commonly used with the typical NLP feature sets (i.e. a large number of binary features). It would certainly be nice to have a good example data set that's easily and predictably separable with an RBF kernel, but I have to admit that I've never had a good intuition for what kinds of decision surfaces an RBF kernel is able to produce.
Original comment by phwetz...@gmail.com
on 1 Aug 2009 at 12:22
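For anyone reproducing the fix, here is a minimal sketch of how the kernel choice maps onto the svm-train command line. The `-t` values follow the svm-train usage text (0 linear, 1 polynomial, 2 RBF, 3 sigmoid). The helper name and the file names are hypothetical, and this is not how the ClearTK tests build their arguments.

```python
# Kernel type codes as listed in the svm-train usage text (-t option).
KERNEL_TYPES = {"linear": 0, "polynomial": 1, "rbf": 2, "sigmoid": 3}


def svm_train_command(training_file, model_file, kernel="linear"):
    """Build an svm-train argument list with an explicit kernel type.

    Hypothetical helper for illustration only; it does not run svm-train.
    """
    return ["svm-train", "-t", str(KERNEL_TYPES[kernel]), training_file, model_file]


# File names are placeholders, not the ones used by the unit tests.
command = svm_train_command("training-data.libsvm", "model.libsvm")
print(" ".join(command))
```

Passing the kernel explicitly means the test no longer depends on the svm-train default, which is RBF.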
Everything is separable with an RBF kernel, as long as you set gamma high enough. You can play around with the demo here:
http://www.csie.ntu.edu.tw/~cjlin/libsvm/#GUI
Basically, with a high enough gamma, the RBF kernel will draw a circle around each data point. Of course, the fact that everything is separable doesn't mean that it's a good separation. ;-)
As far as the ClearTK tests go though, I think it makes sense to explicitly use a linear kernel instead of an RBF one. My experience has generally been that linear and polynomial kernels work better for NLP tasks.
Original comment by steven.b...@gmail.com
on 1 Aug 2009 at 5:01
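To make the "circle around each data point" remark concrete, here is a rough sketch that computes the RBF kernel, exp(-gamma * ||x - y||^2), between a few toy points for several gamma values. The points and gamma values are made up for illustration and have nothing to do with the ClearTK test data.

```python
import math


def rbf_kernel(x, y, gamma):
    """RBF kernel value exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def gram_matrix(points, gamma):
    """Kernel values between every pair of points."""
    return [[rbf_kernel(p, q, gamma) for q in points] for p in points]


# Toy data: the four corners of the unit square (illustrative only).
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

for gamma in (0.1, 1.0, 100.0):
    gram = gram_matrix(points, gamma)
    n = len(points)
    # Largest similarity between two *different* points.
    off_diag = max(gram[i][j] for i in range(n) for j in range(n) if i != j)
    print(f"gamma={gamma}: largest off-diagonal kernel value = {off_diag:.3g}")
```

As gamma grows, the off-diagonal values shrink toward zero while the diagonal stays at 1, so each point ends up similar only to itself. That is why the training data becomes separable, and also why the resulting model may generalize poorly.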
OK, I have changed the training args to "-t 0", per Philipp's suggestion, for the unit tests in question. I will label this as WontFix since it was the tests that were wrong, not the framework code.
Original comment by pvogren@gmail.com
on 13 Aug 2009 at 4:06
Original issue reported on code.google.com by
pvogren@gmail.com
on 31 Jul 2009 at 11:20