Closed akhudek closed 11 years ago
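Hi @akhudek; a few questions:
What's the benefit of keeping an extra dimension around for model evaluation time? If you're going to classify or regress with a given model, you should clean your data to work with the assumptions of that model, no?
Re: bias, does liblinear require that you set the dimension to d+1 when there is a bias? What happens when you don't?
Damn... You're right, it's better just to filter unseen data in this case. Too much "word vector representation" on the brain, where they have a vector for unseen words. I'll change this.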
For bias, it turns out you also have to manually add a bias feature to every instance, with index (inc (count dimensions)). Bias wasn't working at all before, though I've verified that it now does.
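To make the bias handling concrete, here is a minimal sketch in Python rather than the library's Clojure. The `add_bias` and `problem_size` helpers and the dict-based sparse format are hypothetical, not part of clj-liblinear. It assumes liblinear's convention that a negative bias means "no bias term": each instance gets one extra feature at index d+1, and the feature count reported to the solver grows by one to match.

```python
def add_bias(instance, n_dims, bias=1.0):
    """Return a copy of a sparse instance with a bias feature appended.

    instance: dict mapping 1-based feature index -> value (hypothetical format).
    n_dims:   number of real feature dimensions, d.
    bias:     bias value; a negative value means "no bias term".
    """
    with_bias = dict(instance)
    if bias >= 0:
        with_bias[n_dims + 1] = bias  # bias feature lives at index d + 1
    return with_bias


def problem_size(n_dims, bias=1.0):
    """Feature count n the solver should see: d + 1 with a bias, else d."""
    return n_dims + 1 if bias >= 0 else n_dims


x = {1: 0.5, 3: 2.0}
print(add_bias(x, n_dims=3))  # {1: 0.5, 3: 2.0, 4: 1.0}
print(problem_size(3))        # 4
```

The same helper would have to be applied at prediction time as well, otherwise the model's bias weight never sees a matching feature.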
On Thursday, 23 May, 2013 at 1:54 PM, Kevin Lynagh wrote:
Hi @akhudek (https://github.com/akhudek); a few questions: What's the benefit of keeping an extra dimension around for model evaluation time? If you're going to classify or regress with a given model, you should clean your data to work with the assumptions of that model, no?
Re: bias, does liblinear require that you set the dimension to d+1 when there is a bias? What happens when you don't? Damn...— Reply to this email directly or view it on GitHub (https://github.com/lynaghk/clj-liblinear/pull/4#issuecomment-18360310).
Ok, I've reverted to having feature-nodes just discard features not in feature maps (sets already worked this way). I've also updated the predict function to add bias features to instances.
Finally, I've added a simple accuracy output for crossfold that is identical to what liblinear's command-line code returns. The target array is still returned for implementing problem-specific accuracy measures.
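For readers unfamiliar with that accuracy figure, the sketch below assumes it is the usual percentage of instances whose cross-validated prediction matches the true label. It is written in Python for illustration, and `crossfold_accuracy` is a hypothetical name, not the function added in this pull request.

```python
def crossfold_accuracy(labels, predictions):
    """Percentage of instances whose predicted label equals the true label."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy of an empty problem")
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return 100.0 * correct / len(labels)


true_labels = [1, 1, -1, -1]
cv_targets = [1, -1, -1, -1]  # per-instance predictions from cross-validation
print(crossfold_accuracy(true_labels, cv_targets))  # 75.0
```

Because the per-instance targets are still returned, a caller could compute precision, recall, or any other problem-specific measure from the same two sequences.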
Hey Kevin, I understand you're probably busy, is there anything I can help with in regard to these changes? Would be wonderful to have them in the main lib.
Pinging me was all the help I needed---I just lost track of this pull request, sorry about that = ) Tested and merged now. I'll push a 0.1.0 release to Clojars too.
Hey Kevin,
Any interest in these changes? The unseen-feature support reserves index 1 for unseen features. During prediction, any new features get assigned this id. The cross-fold is fine, though I haven't added any code to compute accuracy values, as my own code only handles two classes (i.e. it is not sufficiently general).
The bias support was missing the adjustment to n when the bias is active. You might also want to consider changing the default for the bias to 1 even though it differs from liblinear's defaults. It is apparently rare to need no bias in practice and disabling it might harm accuracy.
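The unseen-feature reservation described in this comment can be sketched roughly as follows. This is illustrative Python, not the Clojure in the pull request: `build_feature_map`, `to_sparse`, and `UNSEEN_ID` are hypothetical names, and summing the values of several unseen features into the one reserved slot is an assumption made for the example.

```python
UNSEEN_ID = 1  # index reserved for features not seen during training


def build_feature_map(training_instances):
    """Assign ids 2, 3, ... to features in first-seen order; 1 stays reserved."""
    feature_map = {}
    for instance in training_instances:
        for name in instance:
            if name not in feature_map:
                feature_map[name] = len(feature_map) + 2
    return feature_map


def to_sparse(instance, feature_map):
    """Convert {name: value} to {id: value}, folding unseen names into UNSEEN_ID."""
    sparse = {}
    for name, value in instance.items():
        idx = feature_map.get(name, UNSEEN_ID)
        sparse[idx] = sparse.get(idx, 0.0) + value  # assumption: unseen values are summed
    return dict(sorted(sparse.items()))


fmap = build_feature_map([{"a": 1.0, "b": 2.0}])
print(fmap)                                             # {'a': 2, 'b': 3}
print(to_sparse({"a": 1.0, "zzz": 1.0, "q": 1.0}, fmap))  # {1: 2.0, 2: 1.0}
```

The alternative is to drop unknown features at prediction time instead of mapping them to a reserved id, which keeps the model's dimensionality equal to the number of features actually seen in training.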