Closed liu-jc closed 5 years ago
hi @liu-jc ,
L-BFGS is, I believe, one of the most popular training algorithms for linear models. Since SGC is linear, we use it to showcase SGC's potential to benefit from second-order optimization.
For TextSGC, we also used L-BFGS when reporting results, as can be seen at https://github.com/Tiiiger/SGC/blob/master/downstream/TextSGC/train.py#L59.
You can try switching to Adam. In my experience, there is no drastic difference in speed or performance, though you do want to tune the learning rate and early-stopping criterion a bit.
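For reference, here is a minimal sketch (not the authors' code; the toy data, dimensions, learning rates, and step counts are illustrative assumptions) of how training a linear model with `torch.optim.LBFGS` differs from Adam — L-BFGS requires a closure that re-evaluates the loss on each step:

```python
import torch

torch.manual_seed(0)
X = torch.randn(100, 16)          # 100 samples, 16 features (toy data)
y = torch.randint(0, 3, (100,))   # 3 classes

def train(optimizer_name):
    model = torch.nn.Linear(16, 3)  # a linear model, like SGC's classifier
    loss_fn = torch.nn.CrossEntropyLoss()
    if optimizer_name == "lbfgs":
        # L-BFGS performs line searches internally, so it needs a
        # closure that recomputes the loss and gradients.
        opt = torch.optim.LBFGS(model.parameters(), lr=1.0)
        def closure():
            opt.zero_grad()
            loss = loss_fn(model(X), y)
            loss.backward()
            return loss
        for _ in range(20):
            opt.step(closure)
    else:
        # Adam uses the usual zero_grad / backward / step loop.
        opt = torch.optim.Adam(model.parameters(), lr=0.2)
        for _ in range(100):
            opt.zero_grad()
            loss = loss_fn(model(X), y)
            loss.backward()
            opt.step()
    return loss_fn(model(X), y).item()

print(train("lbfgs"), train("adam"))
```

Note that L-BFGS stores a history of past gradients (its `history_size` parameter), so its memory footprint is higher than SGD's, while Adam keeps two extra buffers per parameter.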
@felixgwu anything to add?
Hi there,
I noticed that you used the L-BFGS optimizer for the Reddit dataset, and you also state this in the paper. I wonder why you chose this optimizer. Why not just use SGD?
Also, which optimizer did you use for text classification and semi-supervised geolocation classification when you reported efficiency? I ask because different optimizers differ in both speed and memory efficiency.
Could you provide any insights? Thanks!