DRAGNLabs / 301r_retnet


Update Validation Trigger Equation #14

Closed nprisbrey closed 5 months ago

nprisbrey commented 6 months ago

While working on the HF_tokenizers branch, I found that validation can fail to trigger at the 33% mark:

image

As seen above, however, the validation run was still triggered at the 67% mark. I suspect a floating-point rounding issue in the trigger equation. Note that this coincided with increasing the batch size to an unusually high value while running the retnet script. The contents of retnet.sh are given below:

python3 ../train_model.py \
    --activation-dropout 0.0 \
    --dropout 0.0 \
    --checkpoints \
    --embed-dim 32 \
    --ffn-dim 64 \
    --fsdp \
    --layers 1 \
    --lr 0.001 \
    --model retnet \
    --heads 4 \
    --seq-len 32 \
    --value-embed-dim 32 \
    --vocab-size 28783 \
    --device cuda \
    --epochs 1 \
    --batch-size 1024 \
    --rand-seed 42 \
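
For illustration, here is a minimal sketch of one way a trigger could avoid float comparisons altogether. It is not the code in train_model.py and not necessarily what the eventual fix does; the names `validation_steps`, `should_validate`, `num_batches`, and `validation_frequency` are hypothetical. The idea is to compute the trigger points once with integer arithmetic, so that a check at a fraction like 1/3 cannot be lost to rounding.

```python
from typing import List


def validation_steps(num_batches: int, validation_frequency: int) -> List[int]:
    """Return the 1-indexed batch counts after which validation should run.

    Hypothetical helper: splits an epoch into `validation_frequency` equal
    parts using integer ceiling division only, so no float comparison is
    involved.
    """
    if num_batches < 1 or validation_frequency < 1:
        raise ValueError("num_batches and validation_frequency must be >= 1")
    steps = {
        # Integer ceiling of j * num_batches / validation_frequency.
        (j * num_batches + validation_frequency - 1) // validation_frequency
        for j in range(1, validation_frequency + 1)
    }
    # The set drops duplicates when there are fewer batches than triggers.
    return sorted(steps)


def should_validate(batch_idx: int, num_batches: int, validation_frequency: int) -> bool:
    """True if validation should run after the 0-indexed batch `batch_idx`."""
    return (batch_idx + 1) in validation_steps(num_batches, validation_frequency)


if __name__ == "__main__":
    # Float equality is fragile: this comparison is False in IEEE-754 doubles.
    print(0.1 * 3 == 0.3)
    # Integer trigger points for a 100-batch epoch validated three times.
    print(validation_steps(100, 3))
```

In a real training loop the list would be computed once per epoch rather than on every batch; `should_validate` recomputes it here only to keep the sketch short.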

nprisbrey commented 5 months ago

Resolved as of PR #18.