Closed dtingey closed 8 months ago
@KimballNJardine @nprisbrey Merge conflicts are resolved, and I have tested that grid search and training are still functional. Feel free to take a look or test it out to see if it works.
Jk, it is no longer ready to merge. Lightning has been merged into it and we are still getting that to work. I think the final data run will be on the `hugging_faceify` branch.
For clarity: Hugging Face works, Lightning works, grid search works, evals work. Don't ignore those parts.
This PR has two main goals:
- Convert `RetNetModel` and `TransformerModel` into a Hugging Face `PreTrainedModel`.
- Add `run_eval.py` and `slurm/run_eval.sh`.
This allows us to save and load the model configuration and weights very easily. It also allows us to run the lm-evaluation harness on any models that come out of training and grid search.
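For anyone unfamiliar with the save/load pattern, here is a minimal sketch of what subclassing `PreTrainedModel` gives us. `TinyConfig`, `TinyModel`, and `round_trip` are hypothetical stand-ins made up for illustration; they are not the `RetNetModel` or `TransformerModel` code in this PR, and the real classes will likely differ.

```python
import tempfile

import torch
from torch import nn
from transformers import PretrainedConfig, PreTrainedModel


class TinyConfig(PretrainedConfig):
    """Hypothetical config; stands in for the real model configs."""

    model_type = "tiny_example"

    def __init__(self, vocab_size=32, hidden_size=8, **kwargs):
        self.vocab_size = vocab_size
        self.hidden_size = hidden_size
        super().__init__(**kwargs)


class TinyModel(PreTrainedModel):
    """Hypothetical model; stands in for RetNetModel / TransformerModel."""

    config_class = TinyConfig

    def __init__(self, config):
        super().__init__(config)
        self.embed = nn.Embedding(config.vocab_size, config.hidden_size)
        self.head = nn.Linear(config.hidden_size, config.vocab_size)
        self.post_init()  # runs weight initialization via _init_weights

    def _init_weights(self, module):
        # Simple initialization so the sketch does not rely on library defaults.
        if isinstance(module, (nn.Linear, nn.Embedding)):
            nn.init.normal_(module.weight, mean=0.0, std=0.02)
        if isinstance(module, nn.Linear) and module.bias is not None:
            nn.init.zeros_(module.bias)

    def forward(self, input_ids):
        return self.head(self.embed(input_ids))


def round_trip(model):
    """Save config and weights to a temp directory, then load them back."""
    with tempfile.TemporaryDirectory() as tmp:
        model.save_pretrained(tmp)  # writes the config file and the weights
        return TinyModel.from_pretrained(tmp)


if __name__ == "__main__":
    original = TinyModel(TinyConfig(hidden_size=4))
    reloaded = round_trip(original)
    print(reloaded.config.hidden_size)
```

The point is that `save_pretrained` and `from_pretrained` handle both the configuration and the weights in one call, so a checkpoint directory from training or grid search can be reloaded without rebuilding the hyperparameters by hand.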