amueller opened 11 years ago
I will compare. The original paper did some comparisons with SGD (not sklearn's implementation) and found that the projection step and the adaptive learning rate improved performance.
The SGD in scikit-learn actually has an adaptive learning rate - it can even be set to be the same as pegasos, I believe. For the projection step, the claims are much milder in the journal version of the paper and in the source code they provide it is commented out. I have not seen a careful analysis of the projection step, though, and would be quite interested in that.
After looking it up again, I think you need to set power_t=1 to get the pegasos schedule.
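As a quick sanity check (plain Python, no sklearn needed), sklearn's 'invscaling' schedule eta_t = eta0 / t**power_t reduces to the Pegasos schedule eta_t = 1/(lambda*t) when power_t=1 — assuming you also pick eta0 = 1/lambda, which is my assumption here, not something stated above:

```python
# sklearn's 'invscaling' schedule: eta_t = eta0 / t ** power_t
def invscaling(eta0, power_t, t):
    return eta0 / t ** power_t

# Pegasos schedule: eta_t = 1 / (lambda * t)
def pegasos_eta(lam, t):
    return 1.0 / (lam * t)

lam = 0.0001
for t in range(1, 6):
    # with power_t=1 and eta0 = 1/lambda the two schedules coincide
    assert abs(invscaling(1.0 / lam, 1.0, t) - pegasos_eta(lam, t)) < 1e-9
print("schedules match for power_t=1, eta0=1/lambda")
```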
Here are some benchmarks with identical learning rates:
https://raw.github.com/ejlb/pegasos/master/benchmarks/benchmarks.png
Pegasos seems to be slightly more accurate (1%). The only two differences I know of are:
1) the pegasos projection step
2) pegasos trains on random examples, so it may get a better generalisation error
Due to point 2) it is hard to compare speed across iterations.
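For readers unfamiliar with point 1): the projection step in the Pegasos paper rescales w after each update so that it stays inside the ball of radius 1/sqrt(lambda). A minimal numpy sketch (my own illustration, not this repo's API):

```python
import numpy as np

def project(w, lam):
    """Pegasos projection step: rescale w onto the ball of radius
    1/sqrt(lam). The journal version treats this step as optional,
    and the reference code comments it out."""
    radius = 1.0 / np.sqrt(lam)
    norm = np.linalg.norm(w)
    if norm > radius:
        w = w * (radius / norm)
    return w

w = np.array([3.0, 4.0])      # ||w|| = 5
w = project(w, lam=0.25)      # radius = 1/sqrt(0.25) = 2
print(np.linalg.norm(w))      # -> 2.0
```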
Wow that looks quite good. I'm quite surprised your implementation is significantly faster than sklearn. Do you have any idea where that could come from? Also, could you please share your benchmark script?
cc @pprett @larsmans
You say that training on random samples makes it hard to compare speeds. How so? One iteration of SGD is n_samples many updates, which you should compare against n_samples many updates in pegasos. Or did you compare against single updates here?
@amueller SGDClassifier trains on the whole data set at each iteration, I assume? That is probably where the speed difference comes from
edit: yes true, that would be a good comparison. Will upload the benchmark script
Ok, but then the plot doesn't make sense. You should rescale it such that the number of weight updates is the same.
Yeah, will run some with equal weight updates
Yes, SGDClassifier does

```python
for i in xrange(n_iter):
    shuffle(dataset)
    for x in X:
        update()
```
It also wastes a little bit of time in each update, checking whether it should do a PA update or a vanilla additive one.
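To make the two sampling schemes concrete, here is a toy sketch (my own illustration) contrasting sklearn's shuffled epochs with Pegasos-style uniform sampling; once the total number of weight updates is matched, as suggested above, the speed comparison becomes fair:

```python
import random

X = list(range(8))  # toy "dataset" of 8 example indices
n_iter = 3
updates_epoch, updates_random = [], []

# sklearn-style: n_iter shuffled passes over the whole data set
for _ in range(n_iter):
    epoch = X[:]
    random.shuffle(epoch)
    updates_epoch.extend(epoch)        # n_iter * n_samples updates total

# Pegasos-style: one example sampled uniformly at random per update;
# running n_iter * n_samples single updates matches the update count
for _ in range(n_iter * len(X)):
    updates_random.append(random.choice(X))

assert len(updates_epoch) == len(updates_random)  # same number of weight updates
```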
this makes much more sense:
https://raw.github.com/ejlb/pegasos/master/benchmarks/weight_updates/benchmarks.png
Perhaps batching the pegasos weight updates would retain the slight accuracy boost while improving the training time.
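The Pegasos paper itself describes a mini-batch variant along these lines: average the hinge-loss subgradient over k random examples before stepping. A hedged numpy sketch of one such update (my reading of the paper's variant, not this repo's implementation):

```python
import numpy as np

def minibatch_pegasos_step(w, X, y, lam, t, k, rng):
    """One mini-batch Pegasos update: average the hinge-loss subgradient
    over k randomly chosen examples, then step with eta_t = 1/(lam*t)."""
    idx = rng.choice(len(y), size=k, replace=False)
    Xb, yb = X[idx], y[idx]
    margin = yb * (Xb @ w)
    viol = margin < 1.0                 # examples violating the margin
    eta = 1.0 / (lam * t)
    grad = lam * w - (yb[viol, None] * Xb[viol]).sum(axis=0) / k
    return w - eta * grad

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = np.sign(X[:, 0])                    # toy labels from the first feature
w = np.zeros(3)
for t in range(1, 50):
    w = minibatch_pegasos_step(w, X, y, lam=0.1, t=t, k=5, rng=rng)
```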
Yeah, that looks more realistic ;) How did you set alpha and did you set eta0 in the SGD?
I used this: SGDClassifier(power_t=1, learning_rate='invscaling', n_iter=sample_coef, eta0=0.01). The full benchmark is here: https://github.com/ejlb/pegasos/blob/master/benchmarks/weight_updates/benchmark.py
Hey. Did you compare with SGDClassifier? The results should be quite close to yours.