msalganik closed this issue 1 month ago
I have added support for custom datasets. The README now has instructions that illustrate the expected data format. I am going to leave this issue open until testing has been completed with your data.
This is confirmed to be working.
The code takes .jsonl files in which each record has "book_content" and "outcome" fields, representing the book of life and the outcome, respectively.
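For illustration, here is a minimal sketch of writing and reading a file in that shape. Only the two field names come from this thread. The helper names (`write_jsonl`, `read_jsonl`), the example text, and the use of 0/1 integers for "outcome" are assumptions, so check the README for the exact value types the code expects.

```python
import json

# Hypothetical example records. Only the two keys ("book_content" and
# "outcome") come from this issue; the text and the 0/1 encoding are assumed.
example_records = [
    {"book_content": "Example book of life text for person A.", "outcome": 1},
    {"book_content": "Example book of life text for person B.", "outcome": 0},
]


def write_jsonl(records, path):
    """Write one JSON object per line (the JSON Lines layout)."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")


def read_jsonl(path):
    """Read a .jsonl file back into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Calling `write_jsonl(example_records, "train.jsonl")` would produce a two-line file; the file name here is only a placeholder.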
Implementation in llama-recipes:
Add the following flags to the command:
--dataset predefined_dataset
--data_path /path/to/directory/containing/jsonl/files/
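Before pointing the data path flag at a directory, it may help to check that every .jsonl file in it has the two expected fields. The sketch below is a standalone check, not part of llama-recipes; the function name `find_invalid_records` and the decision to scan only the top level of the directory are assumptions.

```python
import json
from pathlib import Path

# Field names taken from this issue.
REQUIRED_KEYS = ("book_content", "outcome")


def find_invalid_records(data_dir):
    """Return (file name, line number, problem) for each bad line.

    Hypothetical helper: scans *.jsonl files directly inside data_dir.
    """
    problems = []
    for path in sorted(Path(data_dir).glob("*.jsonl")):
        with open(path, encoding="utf-8") as f:
            for line_number, line in enumerate(f, start=1):
                if not line.strip():
                    continue  # ignore blank lines
                try:
                    record = json.loads(line)
                except json.JSONDecodeError:
                    problems.append((path.name, line_number, "invalid JSON"))
                    continue
                if not isinstance(record, dict):
                    problems.append((path.name, line_number, "not a JSON object"))
                    continue
                missing = [key for key in REQUIRED_KEYS if key not in record]
                if missing:
                    problems.append(
                        (path.name, line_number, "missing " + ", ".join(missing))
                    )
    return problems
```

An empty list means every non-blank line parsed and contained both fields; anything else points at the file and line to fix.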
I would like to be able to modify the code so that it runs fine-tuning using a dataset that I specify.
This dataset could be in HF format if you prefer.