DRAGNLabs / 301r_retnet


Restructure ckpt naming conventions #60

Closed by DrewGalbraith 3 months ago

DrewGalbraith commented 3 months ago

CKPT files are currently being named like this:

epoch_epoch=0_validation_val_loss=3.08-v1.ckpt
epoch_epoch=0_validation_val_loss=3.08-v2.ckpt
epoch_epoch=0_validation_val_loss=3.08-v3.ckpt

Instead of a -vN suffix, we want to order them by order of appearance, using a leading index. So the implemented output will be:

0_epoch_epoch=0_validation_val_loss=3.08.ckpt
1_epoch_epoch=0_validation_val_loss=3.08.ckpt
2_epoch_epoch=0_validation_val_loss=3.08.ckpt
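
One way the leading index could be generated is sketched below. This is only an illustration under assumptions, not the repo's actual implementation: the helper name next_indexed_name and the idea of scanning the checkpoint directory for existing prefixes are hypothetical, and the real fix may hook into the training framework's checkpoint callback instead.

```python
import re
from pathlib import Path

# Matches a leading "<digits>_" creation index, e.g. "2_" in "2_epoch_...ckpt".
_PREFIX = re.compile(r"^(\d+)_")


def next_indexed_name(ckpt_dir, base_name):
    """Return '<n>_<base_name>.ckpt' (hypothetical helper).

    n is one more than the highest index prefix already present among the
    .ckpt files in ckpt_dir, or 0 if no indexed checkpoint exists yet.
    """
    indices = [
        int(match.group(1))
        for path in Path(ckpt_dir).glob("*.ckpt")
        if (match := _PREFIX.match(path.name))
    ]
    n = max(indices) + 1 if indices else 0
    return f"{n}_{base_name}.ckpt"
```

Calling it with the base name epoch_epoch=0_validation_val_loss=3.08 on an empty directory would give the first name in the list above, and each saved file bumps the next index by one.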

DrewGalbraith commented 3 months ago

The reason the current situation is not ideal is depicted in the attached screenshot. [image: ckpt_names_sreenshot] Though the higher-loss ckpts were generated first, they appear lower in the listing. Additionally, if validation loss starts climbing again, a ckpt from several epochs later could appear as -v18 or similar, directly after a -v17 from much earlier in the training sequence, forcing us to rely on ls -l and similar commands to work out creation order. This is way too much work.
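
With an index prefix, creation order can be recovered from the names alone. The sketch below is an assumption-laden illustration (the helper creation_index and the 10_ file name are hypothetical): it sorts by the numeric prefix rather than as plain strings, because a plain string sort would place 10_... before 2_... once there are ten or more checkpoints.

```python
import re


def creation_index(name):
    """Return the leading numeric index of a checkpoint name (hypothetical helper).

    Names without an index prefix sort last.
    """
    match = re.match(r"(\d+)_", name)
    return int(match.group(1)) if match else float("inf")


# Illustrative names; the 10_ entry is made up to show the ordering pitfall.
names = [
    "2_epoch_epoch=0_validation_val_loss=3.08.ckpt",
    "10_epoch_epoch=0_validation_val_loss=3.08.ckpt",
    "0_epoch_epoch=0_validation_val_loss=3.08.ckpt",
    "1_epoch_epoch=0_validation_val_loss=3.08.ckpt",
]

# Numeric sort on the prefix yields creation order: 0, 1, 2, 10.
by_creation = sorted(names, key=creation_index)
```

If plain ls output should also list the files in creation order, zero-padding the index (for example 00_, 01_, ...) would be one option, though that goes beyond the naming proposed in this issue.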