Closed EcustBoy closed this issue 1 year ago

EcustBoy:
Dear authors, I notice that the learning rate (lr) is a constant value during the whole training (e.g. 2e-4 during the planning-stage training). I wonder if it would be better to use an lr scheduler. Have you ever tried lr decay?

Author reply:
Thank you for your interest in this work. We have used lr decay in the configure_optimizers function: https://github.com/OpenPerceptionX/ST-P3/blob/main/stp3/trainer.py#L459
The lr may differ between tasks, but there is always lr_weight_decay: https://github.com/OpenPerceptionX/ST-P3/blob/main/stp3/config.py#L152
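For readers who want the kind of lr decay discussed above, here is a minimal sketch of a StepLR-style schedule in plain Python (the function `step_lr` and its parameters are illustrative and not part of the ST-P3 codebase; the behavior mirrors `torch.optim.lr_scheduler.StepLR`):

```python
def step_lr(base_lr: float, step_size: int, gamma: float, epoch: int) -> float:
    """StepLR-style decay: multiply the lr by `gamma` every `step_size` epochs."""
    return base_lr * (gamma ** (epoch // step_size))

# Example with the 2e-4 planning-stage lr mentioned in the question,
# halving every 10 epochs:
schedule = [step_lr(2e-4, step_size=10, gamma=0.5, epoch=e) for e in range(30)]
# epochs 0-9  -> 2e-4
# epochs 10-19 -> 1e-4
# epochs 20-29 -> 5e-5
```

In a PyTorch Lightning setup such as this repository's trainer, the usual way to wire this in is to return a scheduler alongside the optimizer from `configure_optimizers`, e.g. `return {"optimizer": opt, "lr_scheduler": torch.optim.lr_scheduler.StepLR(opt, step_size=10, gamma=0.5)}` (exact step size and gamma would need tuning for this task).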