Closed yj14n9xyz closed 8 years ago
@yimingjiang this looks good to me
Yiming, what happened to the CS446 data set you were talking about?
@danyaljj CS 446 HW3 data set would serve as the data set to run all existing learning algorithms in LBJava. I am still working on the translation from MATLAB to Java for data generation.
For the unit tests for AdaGrad, I only verified the correctness of the update rule. There is no data set involved in unit tests.
I am fine with this, but we really need better tests. At the very least, we should make sure that classifiers can overfit a small data set. Say you randomly generate 10 instances and train for 30 iterations: your classifier should then predict at least 9 of them correctly after having seen them.
(This is in addition to what Dan suggested, testing on the CS446 data.)
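A rough sketch of what such an overfitting check could look like, written as standalone Java rather than against the LBJava API. The class `AdaGradOverfitSketch`, its methods, the learning rate, and the clustered data generator are all hypothetical; the two classes are placed around (2, 2) and (-2, -2) so that the random data is linearly separable by construction.

```java
import java.util.Random;

// Hypothetical standalone sketch; none of these names come from LBJava.
class AdaGradOverfitSketch {

    // Draws n random labels in {-1, +1}.
    static int[] randomLabels(int n, Random rng) {
        int[] y = new int[n];
        for (int i = 0; i < n; i++) {
            y[i] = rng.nextBoolean() ? 1 : -1;
        }
        return y;
    }

    // Places each point near label * (2, ..., 2) with noise in [-0.5, 0.5),
    // so the two classes are linearly separable by construction.
    static double[][] clusteredPoints(int[] y, int numFeatures, Random rng) {
        double[][] x = new double[y.length][numFeatures];
        for (int i = 0; i < y.length; i++) {
            for (int j = 0; j < numFeatures; j++) {
                x[i][j] = 2.0 * y[i] + (rng.nextDouble() - 0.5);
            }
        }
        return x;
    }

    // Linear score; the last weight is the bias term.
    static double score(double[] w, double[] x) {
        double s = w[x.length];
        for (int j = 0; j < x.length; j++) {
            s += w[j] * x[j];
        }
        return s;
    }

    // Per-coordinate AdaGrad on the hinge loss max(0, 1 - y * score).
    static double[] train(double[][] x, int[] y, int epochs, double learningRate) {
        int d = x[0].length;
        double[] w = new double[d + 1];           // weights plus bias, start at zero
        double[] sumSquaredGrad = new double[d + 1];
        final double eps = 1e-8;                  // avoids division by zero
        for (int epoch = 0; epoch < epochs; epoch++) {
            for (int i = 0; i < x.length; i++) {
                if (y[i] * score(w, x[i]) >= 1.0) {
                    continue;                     // hinge subgradient is zero here
                }
                for (int j = 0; j <= d; j++) {
                    double xj = (j < d) ? x[i][j] : 1.0;
                    double g = -y[i] * xj;        // hinge subgradient for coordinate j
                    sumSquaredGrad[j] += g * g;
                    w[j] -= learningRate * g / (Math.sqrt(sumSquaredGrad[j]) + eps);
                }
            }
        }
        return w;
    }

    // Counts training examples whose predicted sign matches the label.
    static int countCorrect(double[] w, double[][] x, int[] y) {
        int correct = 0;
        for (int i = 0; i < x.length; i++) {
            int predicted = score(w, x[i]) > 0 ? 1 : -1;
            if (predicted == y[i]) {
                correct++;
            }
        }
        return correct;
    }

    public static void main(String[] args) {
        Random rng = new Random(42);              // fixed seed keeps the test repeatable
        int[] y = randomLabels(10, rng);
        double[][] x = clusteredPoints(y, 2, rng);
        double[] w = train(x, y, 30, 1.0);
        System.out.println("correct on training set: " + countCorrect(w, x, y) + " / 10");
    }
}
```

In a real unit test the final print would become an assertion that at least 9 of the 10 training examples are classified correctly.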
@danyaljj I can add the overfitting tests later. However, both `SGD` and `AdaGrad` are currently only for regression. `TestReal` can only give an RMS error.
1) How are we going to set the boundary to say `AdaGrad` works fine, based on the RMS error?
2) Or alternatively, inherit the current `AdaGrad` and create a `discrete` output version of `AdaGrad`. This might be necessary since we are going to run algorithm tests. It's easier to compare to other algorithms if they are all doing classification.
3) How many features and how many examples are considered to be a small data set?
Yes, we should be able to use adagrad also for classification.
By the way – one interesting test that should be easy to run is to train the POS tagger with all the algorithms we have now in LBJava.
Dan
From: Daniel Khashabi [mailto:notifications@github.com] Sent: Thursday, January 28, 2016 9:13 PM To: IllinoisCogComp/lbjava Cc: Roth, Dan Subject: Re: [lbjava] WIP: added adagrad and unit test (#26)
· for regressions, the trained models should give close enough predictions to gold values (like the absolute difference between the prediction and gold value should be less than a threshold value.)
· I think we should be able to use adagrad for classification as well (right, @christos-c @danr-ccg?). What are the limitations of having adagrad for classification as well?
· in a unit test, 10 examples, each with 2 features should be fine.
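The regression criterion in the first bullet could be checked with a small helper like the one below. This is only a sketch: `RegressionToleranceSketch`, its method names, and the idea of passing the threshold as a parameter are assumptions, not part of LBJava.

```java
// Hypothetical helper; the names and the tolerance parameter are illustrative only.
class RegressionToleranceSketch {

    // Largest absolute difference between a prediction and its gold value.
    static double maxAbsError(double[] predicted, double[] gold) {
        if (predicted.length != gold.length) {
            throw new IllegalArgumentException("predicted and gold must have the same length");
        }
        double worst = 0.0;
        for (int i = 0; i < gold.length; i++) {
            worst = Math.max(worst, Math.abs(predicted[i] - gold[i]));
        }
        return worst;
    }

    // True when every prediction is strictly within `tolerance` of its gold value.
    static boolean allWithinTolerance(double[] predicted, double[] gold, double tolerance) {
        return maxAbsError(predicted, gold) < tolerance;
    }
}
```

Checking the maximum absolute error per example is stricter than a bound on the RMS error, since a single bad prediction fails the test instead of being averaged away.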
@danyaljj @christos-c Hi Daniel, I have added 2 overfitting unit tests. The first one is a data set that I've drawn out on paper and verified that the data set is linearly separable. The second one is a data set randomly generated and I have made this data set linearly separable as well. Steps are well documented in the comments in the unit test class. Please review this update.
I will merge this tmrw if no one complains.
@danyaljj Thanks! I will make another PR to include regression example once this PR is merged.
Sounds good, @yimingjiang! Merging!
@christos-c @danyaljj Please review this PR. Currently AdaGrad uses hinge loss as its loss function. I will add the LMS loss function after adding algo tests.
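For reference, the textbook per-coordinate AdaGrad update with hinge loss can be written as follows. This is the standard formulation and not necessarily exactly what the PR implements; the symbols $\eta$ (learning rate) and $\epsilon$ (small constant for numerical stability) are assumptions introduced here.

$$
\ell(\mathbf{w}; \mathbf{x}_t, y_t) = \max\bigl(0,\; 1 - y_t\,\mathbf{w}^\top \mathbf{x}_t\bigr)
$$

$$
\mathbf{g}_t =
\begin{cases}
-\,y_t\,\mathbf{x}_t & \text{if } y_t\,\mathbf{w}_t^\top \mathbf{x}_t < 1 \\
\mathbf{0} & \text{otherwise}
\end{cases}
$$

$$
G_{t,j} = \sum_{s=1}^{t} g_{s,j}^{2},
\qquad
w_{t+1,j} = w_{t,j} - \frac{\eta}{\sqrt{G_{t,j}} + \epsilon}\, g_{t,j}
$$

With an LMS (squared) loss $\ell = \tfrac{1}{2}(\mathbf{w}^\top \mathbf{x}_t - y_t)^2$, only the gradient changes, to $\mathbf{g}_t = (\mathbf{w}_t^\top \mathbf{x}_t - y_t)\,\mathbf{x}_t$; the accumulation and update steps stay the same.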