karpathy / llm.c

LLM training in simple, raw C/CUDA
MIT License

add outlier detector, test for it, and start tracking z score of loss #637

Closed karpathy closed 5 days ago

karpathy commented 6 days ago

still TODO:

karpathy commented 6 days ago

Introduced two new flags:

    fprintf(stderr, "  -sl <float> outlier stability: skip update if loss goes above this in zscore (0.0f=off, default=3.0f)\n");
    fprintf(stderr, "  -sg <float> outlier stability: skip update if grad_norm goes above this in zscore (0.0f=off, default=3.0f)\n");

They default to 0.0 (old behavior). If set to e.g. 2.5, the update is skipped whenever the z-score of the loss or of the grad norm is > 2.5.