Closed by kuailehaha 3 months ago
Thanks for raising the issue. The misleading output messages and incorrect logic were due to a versioning issue with the working code uploaded at that time. We have uploaded a new version and corrected all of the issues; the code should now work as expected. We have also aligned all scores in our arXiv and camera-ready versions with the output of the current code. Sorry for the confusion.
For reproducibility, please see the Parameter Reference section in each sub-repo for details.
Description
I've encountered a critical issue in the optimizer configuration code: the branch conditions appear to be reversed or incorrectly labeled. This not only leads to potential misuse of optimizer parameters but also produces incorrect console messages that could mislead users of this library.
Issue Details
In the current implementation of GPT2/examples/NLG/optimizer_custom.py, the conditions for selecting optimizers are reversed or incorrectly implemented.
Problems
The first condition, meant for scaled_adamw, incorrectly initializes AdamW. The second condition, meant for adamw, incorrectly initializes AdamWr and passes a parameter rank=args.lora_dim, which is not applicable to the standard AdamW optimizer but suggests a specialized variant, perhaps related to the LoRA architecture.
Expected Behavior
Each branch should accurately reflect the optimizer being used, and the console output should match the actual optimizer configuration. The parameters and optimizer choice should align correctly with the user's input.
Request
I urge the maintainers to:
Additional Context
This misconfiguration is critical as it affects the reproducibility and integrity of the results presented in your experiments (refer to Table 1 from the attached image in this issue). Please provide an updated and correct configuration to avoid further confusion and to maintain the credibility of the research.
Thank you for addressing this issue promptly.