In addition, what is the difference between "Instruction-Tuning", "Code Adaptation", and "Task Finetuning" in terms of training paradigm? In practice, I have reproduced the "Task Finetuning" setup from your paper using the open_instruct/finetune.py script you released. For Instruction-Tuning and Code Adaptation, which training code should I run to start the experiments?
Thanks a lot~
Hi author, I also need the training data of AlpacaFarm, ToxiGen, TruthfulQA, CodexEval and DS-1000. Could you please share it?
We do not use training data for any of the tasks you listed. For instruction-tuning and code adaptation experiments, we use off-the-shelf models (in other words, we did not do any training ourselves). Moreover, the experts are supposed to be general-purpose instruction-following and code models, respectively — they are not task specific.
For the instruction-tuning experiments, the expert is a general instruction-tuned model from the Llama 2 chat series, which you can find on HuggingFace models. For the code adaptation experiments, we use a general code model from the CodeLlama series. You can see how to run evaluation in the eval scripts.
The three sections evaluate three common use cases of tuning: instruction-tuning, domain adaptation, and task-specific finetuning. The expert model for each section is chosen accordingly.
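For anyone reproducing the instruction-tuning or code adaptation sections: since the experts are off-the-shelf models, there is no training step; proxy-tuning only combines next-token logits from the base, expert, and anti-expert models at decoding time. Below is a minimal greedy-decoding sketch of that idea with HuggingFace `transformers`. The specific model names and the simple loop are illustrative assumptions for the instruction-tuning case (swap in CodeLlama for code adaptation), not the repository's exact implementation.

```python
# Minimal proxy-tuning decoding sketch (greedy). Model choices below are
# assumptions for illustration, not the repo's exact configuration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE = "meta-llama/Llama-2-13b-hf"        # large untuned model being "tuned by proxy"
EXPERT = "meta-llama/Llama-2-7b-chat-hf"  # small tuned expert (general instruction-following)
ANTI = "meta-llama/Llama-2-7b-hf"         # small untuned anti-expert

tok = AutoTokenizer.from_pretrained(BASE)
base = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype=torch.float16, device_map="auto")
expert = AutoModelForCausalLM.from_pretrained(EXPERT, torch_dtype=torch.float16, device_map="auto")
anti = AutoModelForCausalLM.from_pretrained(ANTI, torch_dtype=torch.float16, device_map="auto")

@torch.no_grad()
def proxy_tuned_generate(prompt: str, max_new_tokens: int = 128) -> str:
    ids = tok(prompt, return_tensors="pt").input_ids.to(base.device)
    for _ in range(max_new_tokens):
        # Next-token logits from each of the three models on the same prefix.
        l_base = base(ids).logits[:, -1, :]
        l_expert = expert(ids.to(expert.device)).logits[:, -1, :].to(base.device)
        l_anti = anti(ids.to(anti.device)).logits[:, -1, :].to(base.device)
        # Proxy-tuning: shift the base logits by the expert/anti-expert difference.
        logits = l_base + (l_expert - l_anti)
        next_id = logits.argmax(dim=-1, keepdim=True)
        ids = torch.cat([ids, next_id], dim=-1)
        if next_id.item() == tok.eos_token_id:
            break
    return tok.decode(ids[0], skip_special_tokens=True)

print(proxy_tuned_generate("Explain what proxy-tuning does in one sentence."))
```

This is only meant to show why no task-specific training data or training script is involved for these two sections; the released eval scripts handle prompting and metrics for each benchmark.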
Thanks for your reply!
Hi, thanks for your great work! Could you please share the training data for AlpacaFarm, ToxiGen, TruthfulQA, CodexEval and DS-1000? So far I can only find the training data for GSM and TriviaQA in https://github.com/alisawuffles/proxy-tuning/issues/3.
Thanks a lot!