seiyab / adabelief-toy-examples-pytorch

numerical experiments in https://arxiv.org/pdf/2010.07468.pdf

Request for Comparison Results of AdaBelief and Adam on the CIFAR-10 dataset #1

Open wgqhandsome opened 6 months ago

wgqhandsome commented 6 months ago

Hello,

Could you please provide some comparison results between AdaBelief and Adam on the CIFAR-10 dataset? Specifically, I am interested in the following information:

  1. A comparison of the training speed and convergence of both algorithms.
  2. A comparison of the accuracy of the models trained by both algorithms on the test set.
  3. Any stability issues encountered during the training process with either algorithm.

Any related information you could provide would be greatly appreciated.

Thank you!

seiyab commented 6 months ago

To be honest, I don't have a GPU. That is why this repository focuses only on toy examples, so I won't add results for CIFAR-10.

I'm sorry I can't meet your request.

wgqhandsome commented 6 months ago

Thank you very much for your reply. I have another question: why do the settings in your code differ from the typical default values (such as momentum and betas)? Were these parameters carefully selected, or were they set arbitrarily? Here is the relevant code (src/optimizers/optimizers.py):


fig3d_optims = {
    "SGD + Momentum": kw(optim.SGD, lr=10**-3, momentum=0.3, dampening=0.3),
    "SGD + Momentum (α=10^-6)": kw(optim.SGD, lr=10**-6, momentum=0.3, dampening=0.3),
    "Adam": kw(optim.Adam, betas=(0.3, 0.3)),
    "AdaBelief": kw(AdaBelief, betas=(0.3, 0.3), rectify=False),
}

seiyab commented 6 months ago

I followed the original paper, https://arxiv.org/pdf/2010.07468v1. In the "Numerical experiments" section, they say:

we set the parameters of AdaBelief to be the same as the default in Adam [8], β1 = 0.9, β2 = 0.999, ε = 10^-8, and set momentum as 0.9 for SGD. For Fig. 3(d), to match the assumption in Sec. 2.2, we set β1 = β2 = 0.3 for both Adam and AdaBelief, and set momentum as 0.3 for SGD.
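
To make the role of β1 = β2 = 0.3 concrete, here is a minimal pure-Python sketch of the scalar Adam and AdaBelief update rules as I understand them from the paper. It is not the code from this repository: the function name run, the test function f(x) = x^2, the starting point, and the learning rate are all assumptions chosen for illustration.

```python
import math


def run(optimizer, grad_fn, x0, steps=1, lr=1e-3,
        beta1=0.3, beta2=0.3, eps=1e-8):
    """Run a scalar Adam or AdaBelief update for `steps` iterations.

    Hypothetical helper for illustration only; not part of the repository.
    """
    x, m, s = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad_fn(x)
        # Both methods share the exponential moving average of the gradient.
        m = beta1 * m + (1 - beta1) * g
        if optimizer == "adam":
            # Adam tracks the moving average of the squared gradient.
            s = beta2 * s + (1 - beta2) * g * g
        elif optimizer == "adabelief":
            # AdaBelief tracks the squared deviation of g from its average m.
            s = beta2 * s + (1 - beta2) * (g - m) ** 2 + eps
        else:
            raise ValueError(f"unknown optimizer: {optimizer}")
        # Bias correction, as in Adam.
        m_hat = m / (1 - beta1 ** t)
        s_hat = s / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(s_hat) + eps)
    return x


def grad(x):
    # Gradient of the assumed toy objective f(x) = x**2.
    return 2.0 * x


if __name__ == "__main__":
    for name in ("adam", "adabelief"):
        print(name, run(name, grad, x0=1.0, steps=1))
```

The only difference between the two branches is the quantity being averaged in the denominator: g^2 for Adam versus (g - m)^2 for AdaBelief. With small betas such as 0.3, both averages forget old gradients quickly.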

wgqhandsome commented 6 months ago

GitHub is truly a great platform, allowing me to easily connect with talented individuals like you. It's fantastic!

wgqhandsome commented 6 months ago

I only know a little Python. How can I conveniently obtain the images by running your code?

seiyab commented 6 months ago

First, you need pipenv to install the dependencies. After installing pipenv, you can enter the virtual environment with pipenv shell. Running pipenv install will then install the dependencies into the virtual environment, and run.sh will output the images.

Perhaps it might not work. It's too old and