WIP: added adagrad and unit test

yj14n9xyz commented 8 years ago

@christos-c @danyaljj Please review this PR. AdaGrad currently uses hinge loss as its loss function. I will add the LMS loss function after adding the algorithm tests.

christos-c commented 8 years ago

@yimingjiang this looks good to me

danyaljj commented 8 years ago

Yiming, what happened to the CS446 dataset you were talking about?

yj14n9xyz commented 8 years ago

@danyaljj The CS 446 HW3 data set will serve as the data set for running all existing learning algorithms in LBJava. I am still working on translating the data generation code from MATLAB to Java.

For the unit tests for AdaGrad, I only verified the correctness of the update rule. There is no data set involved in unit tests.
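
For context, the kind of update rule such a test checks can be sketched as follows. This is a minimal illustration assuming the standard per-coordinate AdaGrad step applied to the hinge-loss subgradient; the class and method names (`AdaGradUpdateSketch`, `hingeSubgradient`, `step`) are hypothetical and are not taken from the code in this PR.

```java
/** Illustrative sketch only; not the AdaGrad class added in this PR. */
final class AdaGradUpdateSketch {

    /** Subgradient of max(0, 1 - y * (w . x)) with respect to w, for y in {-1, +1}. */
    static double[] hingeSubgradient(double[] w, double[] x, int y) {
        double score = 0.0;
        for (int i = 0; i < w.length; i++) {
            score += w[i] * x[i];
        }
        double[] g = new double[w.length];
        if (y * score < 1.0) { // margin violated: subgradient is -y * x
            for (int i = 0; i < w.length; i++) {
                g[i] = -y * x[i];
            }
        }
        return g; // all zeros when the margin is already met
    }

    /** In-place AdaGrad step: each coordinate is scaled by its own gradient history. */
    static void step(double[] w, double[] gradSquareSums, double[] g, double learningRate) {
        for (int i = 0; i < w.length; i++) {
            if (g[i] == 0.0) {
                continue; // nothing to do, and avoids dividing 0 by 0
            }
            gradSquareSums[i] += g[i] * g[i];
            w[i] -= learningRate * g[i] / Math.sqrt(gradSquareSums[i]);
        }
    }
}
```

A hand-checkable case: starting from zero weights with learning rate 0.1 and the example x = (1, 2), y = +1, the first step moves both weights to 0.1, because each gradient coordinate is divided by its own magnitude.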

danyaljj commented 8 years ago

I am fine with this, but we really need better tests. At a minimum, we should make sure that classifiers do overfit on a small set of data. Say you randomly generate 10 instances and train for 30 iterations: your classifier should then predict at least 9 of those correctly after having seen them.

(This is in addition to what Dan suggested: testing on the CS446 data.)

yj14n9xyz commented 8 years ago

@danyaljj I can add the overfitting tests later. However, both SGD and AdaGrad are currently only for regression. TestReal can only give an RMS error.

1) How are we going to set the boundary to say AdaGrad works fine, based on the RMS error?

2) Alternatively, we could subclass the current AdaGrad to create a discrete-output version. This might be necessary since we are going to run algorithm tests, and it is easier to compare against other algorithms if they all do classification.

3) How many features and how many examples are considered to be a small data set?

danyaljj commented 8 years ago

For regression, the trained models should give predictions close enough to the gold values (e.g., the absolute difference between the prediction and the gold value should be less than a threshold).
I think we should be able to use AdaGrad for classification as well (right, @christos-c @danr-ccg?). What are the limitations of having AdaGrad for classification as well?
In a unit test, 10 examples, each with 2 features, should be fine.
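
The threshold check for regression described above could look roughly like this. It is a sketch under assumptions: the class name `RegressionToleranceSketch`, its method names, and the idea of reporting RMS error alongside the worst-case error are illustrative and not part of LBJava or TestReal.

```java
/** Illustrative sketch only; not part of LBJava. */
final class RegressionToleranceSketch {

    /** Largest absolute gap between a prediction and its gold value. */
    static double maxAbsoluteError(double[] predicted, double[] gold) {
        checkInputs(predicted, gold);
        double worst = 0.0;
        for (int i = 0; i < gold.length; i++) {
            worst = Math.max(worst, Math.abs(predicted[i] - gold[i]));
        }
        return worst;
    }

    /** Root-mean-square error over all examples. */
    static double rootMeanSquaredError(double[] predicted, double[] gold) {
        checkInputs(predicted, gold);
        double sumOfSquares = 0.0;
        for (int i = 0; i < gold.length; i++) {
            double diff = predicted[i] - gold[i];
            sumOfSquares += diff * diff;
        }
        return Math.sqrt(sumOfSquares / gold.length);
    }

    /** True when every prediction is strictly within the threshold of its gold value. */
    static boolean withinThreshold(double[] predicted, double[] gold, double threshold) {
        return maxAbsoluteError(predicted, gold) < threshold;
    }

    private static void checkInputs(double[] predicted, double[] gold) {
        if (predicted.length != gold.length || gold.length == 0) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
    }
}
```

A per-example bound is stricter than a bound on the RMS error, since the RMS error can never exceed the largest absolute error; which threshold value is appropriate remains a choice for the test author.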

danr-ccg commented 8 years ago

Yes, we should be able to use adagrad also for classification.

By the way, one interesting test that should be easy to run is to train the POS tagger with all the algorithms we now have in LBJava.

Dan

From: Daniel Khashabi [mailto:notifications@github.com] Sent: Thursday, January 28, 2016 9:13 PM To: IllinoisCogComp/lbjava Cc: Roth, Dan Subject: Re: [lbjava] WIP: added adagrad and unit test (#26)


yj14n9xyz commented 8 years ago

@danyaljj @christos-c Hi Daniel, I have added 2 overfitting unit tests. The first uses a data set that I drew out on paper and verified to be linearly separable. The second uses a randomly generated data set that I have also made linearly separable. The steps are documented in the comments in the unit test class. Please review this update.
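
An overfitting check of the kind discussed in this thread (10 instances, 2 features, 30 training rounds, at least 9 training examples predicted correctly) might be sketched as below. Everything here is an assumption for illustration: the class `OverfitCheckSketch`, the way the data is made separable (two clusters placed deliberately far apart), the constant bias input, and the inlined AdaGrad hinge-loss learner are not the tests or the learner from this PR.

```java
import java.util.Random;

/** Illustrative sketch only; names and constants are assumptions, not LBJava's API. */
final class OverfitCheckSketch {
    static final int NUM_EXAMPLES = 10;
    static final int NUM_FEATURES = 2;
    static final int ROUNDS = 30;

    /**
     * Fills x and y with two separated clusters: label +1 has features in [1, 2),
     * label -1 has features in (-2, -1]. The last slot of each row is a constant bias input.
     */
    static void generate(long seed, double[][] x, int[] y) {
        Random random = new Random(seed);
        for (int n = 0; n < x.length; n++) {
            y[n] = random.nextBoolean() ? 1 : -1;
            for (int i = 0; i < NUM_FEATURES; i++) {
                x[n][i] = y[n] * (1.0 + random.nextDouble());
            }
            x[n][NUM_FEATURES] = 1.0;
        }
    }

    static double score(double[] w, double[] x) {
        double s = 0.0;
        for (int i = 0; i < w.length; i++) {
            s += w[i] * x[i];
        }
        return s;
    }

    /** Trains with AdaGrad on hinge loss, then counts correctly classified training examples. */
    static int trainAndCountCorrect(double[][] x, int[] y, double learningRate) {
        double[] w = new double[NUM_FEATURES + 1];
        double[] gradSquareSums = new double[NUM_FEATURES + 1];
        for (int round = 0; round < ROUNDS; round++) {
            for (int n = 0; n < x.length; n++) {
                if (y[n] * score(w, x[n]) >= 1.0) {
                    continue; // margin met: zero subgradient, no update
                }
                for (int i = 0; i < w.length; i++) {
                    double g = -y[n] * x[n][i];
                    if (g == 0.0) {
                        continue;
                    }
                    gradSquareSums[i] += g * g;
                    w[i] -= learningRate * g / Math.sqrt(gradSquareSums[i]);
                }
            }
        }
        int correct = 0;
        for (int n = 0; n < x.length; n++) {
            if (y[n] * score(w, x[n]) > 0.0) {
                correct++;
            }
        }
        return correct;
    }
}
```

A test would allocate `x` as `NUM_EXAMPLES` rows of `NUM_FEATURES + 1` columns, call `generate` with a fixed seed so failures are reproducible, and assert that `trainAndCountCorrect` returns at least 9. Evaluating on the training data itself is intentional here: the point is only to confirm that the learner can fit data it has already seen.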

danyaljj commented 8 years ago

I will merge this tomorrow if no one complains.

yj14n9xyz commented 8 years ago

@danyaljj Thanks! I will make another PR to include a regression example once this PR is merged.

danyaljj commented 8 years ago

Sounds good, @yimingjiang! Merging!
