We should log our experiments with wandb.
It's unclear whether logging every model checkpoint to wandb will take too much storage if we save a lot of checkpoints. We likely just want to log training metrics and configs to wandb and have something separate that uploads the checkpoints to huggingface.
An example of wandb logging is here. I think this setup is pretty gross, though it does get around some wandb issues with sweep configs.
It's fine to start by assuming we won't be sweeping and to use a simpler setup.
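A rough sketch of what the simpler, no-sweep setup could look like: metrics and config go to wandb, checkpoints go to a Hugging Face model repo. The function names, the `checkpoints/step_XXXXXXX.pt` layout, and the project and repo names are all made up for illustration, not something we've settled on. It assumes `wandb` and `huggingface_hub` are installed and that we're already logged in to both.

```python
from pathlib import Path


def checkpoint_repo_path(step: int) -> str:
    """Where a checkpoint lives inside the Hugging Face repo (hypothetical layout)."""
    return f"checkpoints/step_{step:07d}.pt"


def start_run(project: str, config: dict):
    """Start a wandb run that records the config. No checkpoints are sent to wandb."""
    import wandb  # imported lazily so the pure helpers work without wandb installed

    return wandb.init(project=project, config=config)


def log_metrics(run, step: int, metrics: dict) -> None:
    """Log scalar training metrics for one step to the given wandb run."""
    run.log(metrics, step=step)


def upload_checkpoint(local_path: Path, repo_id: str, step: int) -> str:
    """Upload one checkpoint file to a Hugging Face model repo and return its repo path."""
    from huggingface_hub import HfApi  # lazy import, same reason as above

    api = HfApi()
    api.create_repo(repo_id, repo_type="model", exist_ok=True)  # no-op if it already exists
    path_in_repo = checkpoint_repo_path(step)
    api.upload_file(
        path_or_fileobj=str(local_path),
        path_in_repo=path_in_repo,
        repo_id=repo_id,
        repo_type="model",
    )
    return path_in_repo
```

In a training loop this would be `run = start_run("my-project", config)` once, `log_metrics(run, step, {"loss": loss})` every step, `upload_checkpoint(path, "my-org/my-model", step)` whenever a checkpoint is saved, and `run.finish()` at the end. Keeping the two concerns in separate functions means we can change the checkpoint cadence without touching what wandb stores.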