pilancilab / Riemannian_Preconditioned_LoRA

Source code for the paper "Riemannian Preconditioned LoRA for Fine-Tuning Foundation Models"
MIT License

Incorrect Optimizer Configuration Leads to Misleading Output Messages #1

Closed · kuailehaha closed this issue 3 months ago

kuailehaha commented 4 months ago

Description

I've encountered a critical issue in the optimizer configuration code: the conditions for selecting optimizers appear to be swapped or mislabeled. This not only risks applying the wrong optimizer parameters but also prints incorrect console messages that could mislead users of this library.

Issue Details

In the current implementation of GPT2/examples/NLG/optimizer_custom.py, the conditions for selecting optimizers are reversed or incorrectly implemented. Here's the problematic part of the code:

if args.opt == "scaled_adamw":
    print('USING OPTIMIZER SCALED AdamW')
    optimizer = AdamW(
        grouped_parameters,
        lr=args.lr,
        betas=(args.adam_beta1, args.adam_beta2),
        eps=args.adam_epsilon,
        weight_decay=args.weight_decay,
        correct_bias=args.correct_bias
    )
if args.opt == "adamw":
    print('USING OPTIMIZER SCALED AdamW')
    optimizer = AdamWr(
        grouped_parameters,
        lr=args.lr,
        betas=(args.adam_beta1, args.adam_beta2),
        eps=args.adam_epsilon,
        weight_decay=args.weight_decay,
        correct_bias=args.correct_bias,
        rank=args.lora_dim
    )

Problems

  1. Both conditions output the same message "USING OPTIMIZER SCALED AdamW", which is incorrect for one of the branches.
  2. The first condition, meant for scaled_adamw, incorrectly initializes the standard AdamW. The second condition, meant for adamw, incorrectly initializes AdamWr and passes rank=args.lora_dim, a parameter that does not apply to standard AdamW and instead suggests a specialized variant, presumably related to the LoRA architecture.

Expected Behavior

Each branch should select the optimizer its name implies, the console output should match the optimizer actually being configured, and the parameters should align with the user's input. A sketch of the corrected logic follows.
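
For concreteness, here is a minimal sketch of what I believe the corrected branches should look like, assuming (as the quoted code suggests) that AdamWr is the repository's rank-aware scaled optimizer and AdamW is the standard one. This is only my reading of the intended behavior, not a confirmed fix:

# Corrected selection logic (sketch). AdamW, AdamWr, args, and
# grouped_parameters are assumed to be defined as in the surrounding
# optimizer_custom.py context quoted above.
if args.opt == "scaled_adamw":
    # Scaled variant: rank-aware optimizer with the LoRA rank passed in.
    print('USING OPTIMIZER SCALED AdamW')
    optimizer = AdamWr(
        grouped_parameters,
        lr=args.lr,
        betas=(args.adam_beta1, args.adam_beta2),
        eps=args.adam_epsilon,
        weight_decay=args.weight_decay,
        correct_bias=args.correct_bias,
        rank=args.lora_dim,  # LoRA rank, used only by the scaled variant
    )
elif args.opt == "adamw":
    # Standard AdamW: no rank argument, and a message that matches the branch.
    print('USING OPTIMIZER AdamW')
    optimizer = AdamW(
        grouped_parameters,
        lr=args.lr,
        betas=(args.adam_beta1, args.adam_beta2),
        eps=args.adam_epsilon,
        weight_decay=args.weight_decay,
        correct_bias=args.correct_bias,
    )

Using elif also makes the two cases mutually exclusive, so a later branch can never silently overwrite an optimizer created by an earlier one.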

Request

I urge the maintainers to:

  1. Correct the branch conditions and console messages so that each value of args.opt selects the optimizer it names.
  2. Clarify which optimizer configuration was actually used to produce the published experimental results.

Additional Context

This misconfiguration is critical because it affects the reproducibility and integrity of the results reported in your experiments (see Table 1 in the image attached to this issue). Please provide an updated, correct configuration to avoid further confusion and to maintain the credibility of the research.

Thank you for addressing this issue promptly.

fangzhaoz commented 4 months ago

Thanks for raising the issue. The misleading output messages and incorrect branch logic were due to a versioning problem with the working code uploaded at the time. We have uploaded a new version of the code and corrected all of these issues; it should now behave as expected. We have also aligned all scores in our arXiv and camera-ready versions with the output of the current code version. Sorry for the confusion.

For reproducibility details, please see the Parameter Reference section in each sub-repo.