Submitters have invested in both LARS and SGD for this round, and the rules were not clear during this cycle. We want to enable submitters who invested in either optimizer to submit successfully this round, allowing both SGD with a polynomial learning rate schedule and LARS at lower batch sizes.

Proposal:
Choice of optimizer is a hyperparameter.
LARS and SGD (with a polynomial learning rate schedule) are both allowed at all batch sizes for ResNet.
Hyperparameters are restricted by the HP table.
SGD must use the polynomial learning rate schedule as outlined by the HP table (a rough sketch of such a schedule is given below).
SGD with a polynomial learning rate schedule is allowed even though it has not yet been added to the reference; the reference is to be updated.
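For illustration only, a minimal sketch of what a polynomial learning rate schedule for SGD might look like. The function name, the base/end learning rates, decay horizon, power, and warmup below are placeholder assumptions for readability, not the values mandated by the HP table.

```python
# Illustrative sketch only: a polynomial learning-rate schedule for SGD.
# All numeric values are placeholder assumptions, not the HP-table values.

def polynomial_lr(step, base_lr=0.1, end_lr=0.0, decay_steps=10000,
                  power=2.0, warmup_steps=500):
    """Return the SGD learning rate at a given training step."""
    if step < warmup_steps:
        # Linear warmup from 0 to base_lr (an assumption, not required by the rules).
        return base_lr * step / warmup_steps
    # Polynomial decay from base_lr down to end_lr over decay_steps.
    progress = min(step - warmup_steps, decay_steps) / decay_steps
    return (base_lr - end_lr) * (1.0 - progress) ** power + end_lr

# Example: inspect the learning rate at a few points in training.
for s in (0, 250, 500, 5000, 10500):
    print(s, polynomial_lr(s))
```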