🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash attention v2.
@nairbv I will do that once we make the repo installable (if we decide to do that). Otherwise, moving it around will break many places, since it currently relies on relative imports and relative job submission.
Moved scripts to the `scripts/` folder. The current way to call them would be, e.g., `sbatch scripts/train.slurm` from the repo root folder. We can relax the job-submission folder constraint in the future by making the repo installable.
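
Until then, the repo-root constraint could be enforced with a small wrapper. This is only a rough sketch, not something in this PR: `build_submit_command` and `submit` are hypothetical helper names, and the only things taken from the change itself are the `scripts/train.slurm` path and the `sbatch` call.

```python
import subprocess
from pathlib import Path
from typing import List


def build_submit_command(repo_root: Path, script: str = "scripts/train.slurm") -> List[str]:
    """Build the sbatch command, failing early if the script is not under repo_root.

    Hypothetical helper for illustration; not part of the repo.
    """
    if not (repo_root / script).is_file():
        raise FileNotFoundError(
            f"{script} not found under {repo_root}; submit from the repo root"
        )
    # Keep the script path relative so it matches the documented invocation.
    return ["sbatch", script]


def submit(repo_root: Path) -> None:
    """Submit the job with the working directory set to the repo root."""
    # cwd=repo_root means relative paths are resolved from the repo root,
    # regardless of where the caller invoked this function.
    subprocess.run(build_submit_command(repo_root), cwd=repo_root, check=True)
```

Calling `submit(Path("."))` from the repo root would be equivalent to running `sbatch scripts/train.slurm` there; from any other folder the existence check raises instead of submitting a job with broken relative paths.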