zyushun / Adam-mini

Code for Adam-mini: Use Fewer Learning Rates To Gain More https://arxiv.org/abs/2406.16793

Add a setup.py and some pip install related changes #10

Open Mrw33554432 opened 1 month ago

Mrw33554432 commented 1 month ago

Now you can install it as a package using "pip install .". I have not fully verified the code, but it should work like any other pip package. A few other things still need attention, such as setting the author name and license in setup.py and updating the examples to use the new package. Afterward you could upload it to PyPI for convenient installation.
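For illustration, a minimal setup.py along these lines could look like the sketch below. The package name, version, and metadata here are placeholders, not the PR's actual contents.

from setuptools import setup, find_packages

setup(
    name="adam-mini",                      # placeholder distribution name
    version="0.1.0",                       # placeholder version
    description="Adam-mini: Use Fewer Learning Rates To Gain More",
    author="<author_name>",                # to be filled in by the maintainers
    license="<license>",                   # to be filled in by the maintainers
    packages=find_packages(),
    install_requires=["torch"],            # Adam-mini is a PyTorch optimizer
    python_requires=">=3.6",
)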

zyushun commented 1 month ago

Hi @Mrw33554432! Thanks so much for your interest and support! Sorry for the late reply.

We have now set up the pip installation (thanks for the template you provided!). You can install and use Adam-mini as follows.

git clone https://github.com/zyushun/Adam-mini
pip install -e .

from adam_mini import Adam_mini

optimizer = Adam_mini(
    named_parameters=model.named_parameters(),
    lr=lr,
    betas=(beta1, beta2),
    eps=eps,
    weight_decay=weight_decay,
    model_sharding=True,                 # set True when training with model sharding/parallelism (e.g., FSDP or DeepSpeed ZeRO)
    dim=model_config.dim,                # hidden dimension of the model
    n_heads=model_config.n_heads,        # number of attention heads
    n_kv_heads=model_config.n_kv_heads,  # number of key/value heads (may differ from n_heads under GQA)
)
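For a quick sanity check, here is a minimal sketch of how the optimizer plugs into a standard PyTorch training loop. The toy model, data, and hyperparameter values are placeholders for illustration only, not from the repository.

import torch
from adam_mini import Adam_mini

# Toy model and config values, for illustration only.
model = torch.nn.Linear(64, 64)

optimizer = Adam_mini(
    named_parameters=model.named_parameters(),
    lr=1e-3,
    betas=(0.9, 0.999),
    eps=1e-8,
    weight_decay=0.0,
    model_sharding=False,   # assumed: no model parallelism in this single-process toy example
    dim=64,                 # placeholder hidden dimension
    n_heads=4,              # placeholder head counts (not exercised by the toy Linear model)
    n_kv_heads=4,
)

for step in range(5):
    x = torch.randn(32, 64)
    loss = model(x).pow(2).mean()   # dummy loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()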

Thanks again for your great help!