playertr opened this issue 3 years ago (status: Open)
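Note: garbage inputs may cause frequent momentary spikes in the loss. The large gradients from those spikes linger in Adam's running averages of the gradient and squared gradient, distorting its step sizes long after the bad batch is gone.
Could use SGD with momentum instead, or tweak Adam's hyperparameters (lower beta1/beta2) so the running averages have a shorter memory.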
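
To make the "memory" point concrete, here is a rough pure-Python sketch of Adam's second-moment average on a single scalar parameter. It is not code from this repo: the function name, the spike size of 100, and the step counts are made-up values for illustration. It feeds in a steady gradient of 1.0 with one spike, then reports the per-step scale 1/(sqrt(v_hat) + eps) that Adam would apply 100 steps later.

```python
import math


def step_scale_after_spike(beta2, spike=100.0, normal=1.0,
                           warmup=2000, after=100, eps=1e-8):
    """Return Adam's step scale 1/(sqrt(v_hat) + eps) `after` steps past one spike.

    Hypothetical helper for illustration: scalar gradients only, with a
    single outlier gradient standing in for a garbage input.
    """
    grads = [normal] * warmup + [spike] + [normal] * after
    v = 0.0  # running average of squared gradients (second moment)
    t = 0
    for g in grads:
        t += 1
        v = beta2 * v + (1.0 - beta2) * g * g
    v_hat = v / (1.0 - beta2 ** t)  # Adam's bias correction
    return 1.0 / (math.sqrt(v_hat) + eps)


# beta2 = 0.999 is the usual Adam default; 0.9 is an arbitrary "short memory" choice.
long_memory_scale = step_scale_after_spike(beta2=0.999)
short_memory_scale = step_scale_after_spike(beta2=0.9)

print(f"beta2=0.999: step scale 100 steps after spike = {long_memory_scale:.3f}")
print(f"beta2=0.9:   step scale 100 steps after spike = {short_memory_scale:.3f}")
```

With steady gradients of 1.0 the scale would sit near 1.0. In this toy setup the beta2=0.999 run is still well below that 100 steps after the spike, while the beta2=0.9 run has nearly recovered. The tradeoff is that a shorter memory reacts more strongly to the spike at the moment it happens. If the training code uses PyTorch (an assumption, the issue does not say), the corresponding knob would be the `betas` argument of `torch.optim.Adam`.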