Please check that this issue hasn't been reported before.
Expected Behavior
Train the Llama 3 based model on the dataset and save the LoRA output to the output folder.
Current behaviour
The model completes training, reports that it is saving, then outputs a large amount of text and never actually saves the LoRA output.
Steps to reproduce
1. Run `accelerate launch -m axolotl.cli.train examples/llama-3/lora-8b.yml`
2. The model loads and trains.
3. It reports that it is saving but only outputs a large amount of text; the log shows:
[INFO] [axolotl.train.train:173] [PID:1578] [RANK:0] Training Completed!!! Saving pre-trained model to ./outputs/lora-out
Config yaml
Possible solution
I trained on the same dataset with Llama 2 to rule out the dataset, and that worked fine; I was even able to download that model, convert it, and run it locally. This isolates the problem to training the LoRA on Llama 3.
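For context, a minimal sketch of the kind of local check that confirms an adapter was actually written and loads (this assumes a standard PEFT-style LoRA output directory; the base model ID and paths below are placeholders, not the exact ones from this run):

```python
# Sketch only: verify that a LoRA adapter was saved and can be applied to the base model.
# Base model ID and adapter path are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B"   # placeholder base model
adapter_dir = "./outputs/lora-out"        # directory the training run should have populated

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)
model = PeftModel.from_pretrained(base, adapter_dir)  # raises if no adapter files exist here

inputs = tokenizer("Hello", return_tensors="pt")
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```

In the failing Llama 3 run the output directory is never populated, so a check like this never gets that far.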
Which Operating Systems are you using?
[X] Linux
[ ] macOS
[ ] Windows
Python Version
3.10.14
axolotl branch-commit
main (running on Jarvis.ai)
Acknowledgements
[X] My issue title is concise, descriptive, and in title casing.
[X] I have searched the existing issues to make sure this bug has not been reported yet.
[X] I am using the latest version of axolotl.
[X] I have provided enough information for the maintainers to reproduce and diagnose the issue.