Closed: karpathy closed this 5 days ago
Introduced two new flags:
fprintf(stderr, " -sl <float> outlier stability: skip update if loss goes above this in zscore (0.0f=off, default=3.0f)\n");
fprintf(stderr, " -sg <float> outlier stability: skip update if grad_norm goes above this in zscore (0.0f=off, default=3.0f)\n");
They default to 0.0 (off, i.e. the old behavior). If set to e.g. 2.5, the update is skipped whenever the z-score of the loss or of the grad norm is > 2.5.
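To make the skip rule concrete, here is a minimal sketch of one way a z-score check could be wired up. It is not the code in this PR: `ZWindow`, `ZWINDOW`, `zwindow_init`, `zwindow_push`, `zwindow_is_outlier` and `should_skip_update` are all hypothetical names, and the window size of 8 is an arbitrary choice for illustration. The sketch assumes the z-score is measured against a sliding window of recent values, and it compares squared quantities so that no `sqrt` call is needed.

```c
#include <assert.h>
#include <stddef.h>

#define ZWINDOW 8  // hypothetical window size, kept small for illustration

// Sliding window of the most recent values (loss or grad norm).
typedef struct {
    double buf[ZWINDOW];
    int count;  // number of values stored so far, at most ZWINDOW
    int index;  // slot that the next push will overwrite
} ZWindow;

void zwindow_init(ZWindow *w) {
    assert(w != NULL);
    w->count = 0;
    w->index = 0;
}

// Store x, overwriting the oldest value once the window is full.
void zwindow_push(ZWindow *w, double x) {
    assert(w != NULL);
    w->buf[w->index] = x;
    w->index = (w->index + 1) % ZWINDOW;
    if (w->count < ZWINDOW) { w->count++; }
}

// Returns 1 if the z-score of x relative to the window exceeds threshold.
// Returns 0 if threshold <= 0 (check is off), if the window is not yet full,
// or if the window has zero variance. Only upward spikes count as outliers.
int zwindow_is_outlier(const ZWindow *w, double x, double threshold) {
    assert(w != NULL);
    if (threshold <= 0.0 || w->count < ZWINDOW) { return 0; }
    double mean = 0.0;
    for (int i = 0; i < ZWINDOW; i++) { mean += w->buf[i]; }
    mean /= ZWINDOW;
    double var = 0.0;
    for (int i = 0; i < ZWINDOW; i++) {
        double d = w->buf[i] - mean;
        var += d * d;
    }
    var /= ZWINDOW;
    if (var <= 0.0) { return 0; }
    double dev = x - mean;
    // z > threshold  <=>  dev > 0 and dev^2 > threshold^2 * var
    return dev > 0.0 && dev * dev > threshold * threshold * var;
}

// Skip the update if either the loss or the grad norm is an outlier.
// sl and sg play the role of the -sl and -sg thresholds (0.0 = off).
int should_skip_update(const ZWindow *loss_w, double loss, double sl,
                       const ZWindow *grad_w, double grad_norm, double sg) {
    return zwindow_is_outlier(loss_w, loss, sl) ||
           zwindow_is_outlier(grad_w, grad_norm, sg);
}
```

In a training loop, one would call `should_skip_update` with the current step's loss and grad norm before applying the optimizer step, then push both values into their windows. Whether outlier values should themselves be pushed into the window is a design choice this sketch leaves open.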
still TODO:
gpt2_update function)