Open GoogleCodeExporter opened 9 years ago
[deleted comment]
I've experimented with different values of epsilon (confidence) as well as the
initial variance, but the updates remain small. Further, when these parameters
deviate too much from the current defaults (which I set through experimentation
on three different tasks), performance suffers. We could add a multiplier to the
update, but this would likely hurt performance as well (part of the reason CW
works so well is that the updates are conservative in the first place). One
option to consider is having CW override the default temperature, or making the
default temperature low (MIRA may have similar temperature problems in certain
settings).
A final possibility would be to learn the temperature from the data in a final
step right before inference is performed.
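
To illustrate the temperature issue, here is a minimal sketch, assuming that
inference samples from a distribution proportional to exp(score / temperature).
The function names and the grid of candidate temperatures are hypothetical and
not part of the library. With small weights the scores are close together, so
at temperature 1 the distribution is nearly uniform; a lower temperature
sharpens it. The second function shows one simple way a temperature could be
picked from held-out data, by minimizing the negative log-likelihood of the
gold choices.

```python
import math

def log_prob(scores, gold, temperature):
    """Log-probability of index `gold` under p_i proportional to exp(score_i / temperature)."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in scaled))
    return scaled[gold] - log_z

def fit_temperature(examples, candidates):
    """Pick the candidate temperature with the lowest mean negative log-likelihood.

    `examples` is a list of (scores, gold_index) pairs (hypothetical held-out data).
    """
    def mean_nll(t):
        return -sum(log_prob(s, g, t) for s, g in examples) / len(examples)
    return min(candidates, key=mean_nll)

# Small scores, as produced by very conservative updates.
scores = [0.02, 0.01, 0.0]
p_default = math.exp(log_prob(scores, 0, 1.0))    # close to uniform (about 1/3)
p_cold = math.exp(log_prob(scores, 0, 0.001))     # almost all mass on the best choice
best_t = fit_temperature([(scores, 0)], [1.0, 0.1, 0.01])
```

A grid search is used here only because it is short; any one-dimensional
optimizer over the temperature would serve the same purpose.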
Original comment by thebiasedestimator@gmail.com
on 23 Nov 2009 at 3:32
Along the same lines, I'm adding a box constraint to MIRA, since constraints
that are not separable (due to an impoverished feature space) will result in
huge updates to the parameters (resulting in infinite weight vectors).
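
One common way to realize a box constraint is to clip the step size, as in the
PA-I variant of passive-aggressive learning. Whether that matches the change
described above is an assumption; the sketch below only shows why an unclipped
step can blow up when the feature difference between two competing
configurations is tiny. The function name and the `box` parameter are
hypothetical.

```python
def mira_step_size(loss, feature_diff, box=None):
    """Step size for a single-constraint MIRA-style update.

    loss         -- how much the margin constraint is violated (<= 0 means satisfied)
    feature_diff -- feature vector of the better configuration minus the worse one
    box          -- optional upper bound on the step size (the box constraint)
    """
    sq_norm = sum(d * d for d in feature_diff)
    if loss <= 0.0 or sq_norm == 0.0:
        # Nothing to fix, or no feature distinguishes the two configurations.
        return 0.0
    tau = loss / sq_norm  # unclipped step: grows without bound as sq_norm -> 0
    return tau if box is None else min(box, tau)

# Nearly indistinguishable configurations: the unclipped step is huge.
unclipped = mira_step_size(1.0, [1e-3, 0.0])
clipped = mira_step_size(1.0, [1e-3, 0.0], box=1.0)
```

With the bound in place, a constraint that cannot be satisfied moves the
weights by at most `box` times the feature difference per update.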
Original comment by thebiasedestimator@gmail.com
on 23 Nov 2009 at 3:44
i meant eta (not epsilon) ^^
Original comment by thebiasedestimator@gmail.com
on 23 Nov 2009 at 4:11
[deleted comment]
OK, I have a possible solution here that re-orders some operations. Rather than
computing the multiplier on the diagonal approximation (resulting in loss of
update-mass), we'll first use the diagonal matrix to 'bend' the gradient, then
compute a multiplier (this way we account for the weight that was projected out
of the full covariance matrix).
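
The re-ordering can be sketched roughly as follows. This is an illustration
under assumptions, not the checked-in algorithm: the covariance is a plain list
of per-feature variances, and a passive-aggressive-style multiplier stands in
for the actual CW multiplier, which is not spelled out here. All names are
hypothetical. The point is only the order of operations: the direction is bent
first, and the multiplier is then computed against the bent direction, so the
update still closes the full margin violation.

```python
def bent_update(weights, gradient, variances, loss):
    """Bend the gradient with a diagonal covariance, then compute the multiplier."""
    # Step 1: bend the gradient; high-variance (low-confidence) features move more.
    direction = [v * g for v, g in zip(variances, gradient)]
    # Step 2: compute the multiplier against the bent direction, so that the
    # score along `gradient` changes by exactly `loss`.
    denom = sum(g * d for g, d in zip(gradient, direction))
    if loss <= 0.0 or denom == 0.0:
        return list(weights)  # constraint satisfied, or no usable direction
    alpha = loss / denom
    return [w + alpha * d for w, d in zip(weights, direction)]

old_w = [0.0, 0.0]
grad = [1.0, 1.0]
new_w = bent_update(old_w, grad, variances=[1.0, 0.25], loss=1.0)
# Change in score along the gradient after the update.
gain = sum(g * (n - o) for g, n, o in zip(grad, new_w, old_w))
```

In this toy case the first feature, which has the larger variance, receives
most of the update, while the total change in score equals the loss.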
Initial testing indicates that this may solve the problem of ultra-conservative
updates and also achieves better generalization (empirically on coreference)
than any method to date. This will be checked in (as a separate CW algorithm)
after further testing.
Original comment by thebiasedestimator@gmail.com
on 17 Apr 2010 at 4:49
Original issue reported on code.google.com by
andrew.k.mccallum
on 19 Nov 2009 at 9:21