The objective of this experiment is to determine the convergence value of the loss in a long-running experiment on a small model. In the case of #52, this value was 1.1.
Based on #48, I've determined that an LR scaling factor of 1.0 is likely the better choice for stability. I've also noticed that the loss of smaller models behaves more deterministically, so I'm creating only a single run for this experiment: I have no reason (yet) to believe that the loss graphs of repeated runs would diverge.
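One possible way to read a convergence value off a logged loss curve is sketched below. This is only an illustration, not the procedure used in #52 or #48: the function name `estimate_convergence`, the `tail_fraction` and `tolerance` parameters, and the synthetic loss curve are all assumptions made for the example. The idea is to average the last part of the curve and to call the run converged only if that average barely moved compared with the window before it.

```python
from statistics import mean


def estimate_convergence(losses, tail_fraction=0.1, tolerance=0.01):
    """Return (plateau_value, has_converged) for a list of per-step losses.

    Hypothetical helper: the plateau is the mean of the last `tail_fraction`
    of the curve, and the run counts as converged when that mean differs from
    the mean of the preceding window by less than `tolerance`.
    """
    if not losses:
        raise ValueError("losses must not be empty")
    window = max(2, int(len(losses) * tail_fraction))
    tail = losses[-window:]
    # Fall back to the tail itself when the curve is too short for two windows.
    previous = losses[-2 * window:-window] or tail
    plateau = mean(tail)
    has_converged = abs(mean(previous) - plateau) < tolerance
    return plateau, has_converged


# Synthetic curve (not real data) that decays toward 1.1.
synthetic_losses = [1.1 + 2.0 * 0.9 ** step for step in range(200)]
plateau, has_converged = estimate_convergence(synthetic_losses)
print(f"plateau={plateau:.3f}, converged={has_converged}")
```

A check like this also gives a cheap signal for when a single run is no longer enough: if repeated runs produced plateau values that differ by more than `tolerance`, that would be a reason to revisit the single-run assumption.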
Coming from:
52
48