nuochenpku / COMEDY

This is the official project of paper: Compress to Impress: Unleashing the Potential of Compressive Memory in Real-World Long-Term Conversations

Having Troubles Re-Implementing the Training Process #1

Open wjcldply opened 5 days ago

wjcldply commented 5 days ago

Hi, first of all, thank you for the inspirational paper. I am currently trying to re-implement the training process, but I'm running into some trouble. It would be great if you could give me some guidance...

First, I don't think the Nuo97/Dolphin-DPO dataset is actually on the Hugging Face Hub. The dataset card is there, but when I try to load it, it says there's no dataset in that directory. Nuo97/Dolphin_Step1 through Step3 loaded just fine, but this is where my second question comes in: the loaded datasets all have a train split only, no validation split. Yet when I look at run_step1.13B.sh, it seems I have to pass the train set and validation set directories separately. Does this mean I should load the Nuo97/Dolphin_Step1~3 datasets with Hugging Face's `load_dataset` function, manually split them into train and validation sets, and save them?

Also, I'd like to know exactly how to run the run_step1.13B.sh script. I tried running `run_step1.13B.sh meta-llama/Llama-2-13b . . ./Dataset/train/<train_data_filename> ./Dataset/train/` but nothing seemed to happen. What am I missing or misunderstanding?

Some advice or guidance on how to re-implement the training process would be greatly appreciated. Thank you :)

jerrynchen commented 5 days ago

Hi, thanks for your question.

  1. For the DPO dataset, please refer to https://huggingface.co/datasets/Nuo97/Dolphin-DPO/blob/main/memory_dpo_train.
  2. For the train and validation sets, it's fine to split the original dataset yourself; a sketch of one way to do it follows below.
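
A minimal sketch of that split, assuming the `datasets` library, one of the released step datasets, and JSON-lines output paths (all assumptions to adapt to whatever `run_step1.13B.sh` actually expects):

```python
from datasets import load_dataset

# Load one of the released datasets (exact name assumed here).
ds = load_dataset("Nuo97/Dolphin_Step1", split="train")

# Hold out a few hundred samples as a validation split.
splits = ds.train_test_split(test_size=500, seed=42)

# Save both splits to disk; the JSON-lines paths are assumptions.
splits["train"].to_json("Dataset/train/dolphin_step1_train.jsonl")
splits["test"].to_json("Dataset/valid/dolphin_step1_valid.jsonl")
```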
wjcldply commented 1 day ago

Thanks a lot. I'd also appreciate it if you could clarify what `# Reminder to shuffle train data in advance!` in run_step1.13B.sh means. Does it mean I should load Nuo97/Dolphin-Task1, Nuo97/Dolphin-Task2, and Nuo97/Dolphin-Task3 and merge them into a single training dataset? Thanks a lot in advance.

nuochenpku commented 2 hours ago

Hi. You can just randomly sample 100-500 examples for evaluation. And since we conduct mixed-task instruction tuning, you can fuse the three tasks into a single file, which results in slightly better performance.
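
As a rough illustration (the dataset names and output path below are assumptions, not the repo's actual preprocessing), the fuse-and-shuffle step could look like:

```python
from datasets import load_dataset, concatenate_datasets

# Load the three task datasets (names assumed from the thread above;
# concatenation assumes they share the same schema).
tasks = [
    load_dataset(name, split="train")
    for name in ("Nuo97/Dolphin_Task1", "Nuo97/Dolphin_Task2", "Nuo97/Dolphin_Task3")
]

# Concatenate, then shuffle so examples from the three tasks are
# interleaved -- the "shuffle train data in advance" reminder.
mixed = concatenate_datasets(tasks).shuffle(seed=42)
mixed.to_json("Dataset/train/dolphin_mixed_train.jsonl")
```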