orionw / FollowIR

FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions
https://arxiv.org/abs/2403.15246

Training template #8

Open ZZZYYYLL opened 1 week ago

ZZZYYYLL commented 1 week ago

Hi, thanks for the great work. I have a question about how to transform the training dataset into the llama_factory format.

I'd like to ask for advice on how to properly construct the training data format for llama_factory fine-tuning. I found FollowIR-7B's training set on Hugging Face, and the format is as follows:

{
  "score": "the score from Mistral-Instruct-7B-v0.2 of whether it was relevant or not (1 is relevant, 0 is not)",
  "label": "the label of relevance from GPT-3.5-Turbo-1106 who created the document",
  "id": "the id from the original TREC track and the file it came from",
  "document": "the synthetic document produced by GPT-3.5-Turbo-1106 given the original instruction, query, and label",
  "query": "the query written by TREC",
  "instruction": "the instruction (or narrative) written by TREC for human annotation"
}

To fit llama_factory's format, should the data I build for fine-tuning look like this:

{
  "instruction": "<s> [INST] You are an expert Google searcher, whose job is to determine if the following document is relevant to the query (true/false). Answer using only one word, one of those two choices.\n",
  "input": "Query: {query}  {instruction}\n Document: {document}\n Relevant (only output one word, either \"true\" or \"false\"): [/INST]",
  "output": "{label}"
}
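
For context, here is the rough conversion script I have drafted (untested; the Hugging Face dataset ID and the exact field handling are my guesses based on the schema above):

import json

from datasets import load_dataset

# Dataset ID is a guess -- substitute the actual FollowIR training set
# from the model card if it differs.
ds = load_dataset("jhu-clsp/FollowIR-train", split="train")

# System-style prefix from the prompt format above, Mistral tokens included.
SYSTEM = (
    "<s> [INST] You are an expert Google searcher, whose job is to determine "
    "if the following document is relevant to the query (true/false). "
    "Answer using only one word, one of those two choices.\n"
)

records = []
for ex in ds:
    records.append({
        "instruction": SYSTEM,
        "input": (
            f"Query: {ex['query']}  {ex['instruction']}\n"
            f" Document: {ex['document']}\n"
            ' Relevant (only output one word, either "true" or "false"): [/INST]'
        ),
        "output": ex["label"],
    })

# llama_factory's alpaca-style data is a JSON list of
# {instruction, input, output} records.
with open("followir_train.json", "w") as f:
    json.dump(records, f, indent=2, ensure_ascii=False)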

I would appreciate it if you could give me an example.

orionw commented 1 week ago

Thanks for the interest! Here's an example: https://github.com/orionw/FollowIR/issues/5#issuecomment-2330372098

Your format looks correct offhand, but I would probably do a diff to be certain. EDIT: ah, I think you're adding the Mistral tokens, but llama_factory adds those via the --template flag. There are probably a few other small differences like that between the two.
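
If you let llama_factory's Mistral template add the special tokens for you, the records would look more like this (a sketch, not checked against llama_factory's exact template output):

{
  "instruction": "You are an expert Google searcher, whose job is to determine if the following document is relevant to the query (true/false). Answer using only one word, one of those two choices.\n",
  "input": "Query: {query}  {instruction}\n Document: {document}\n Relevant (only output one word, either \"true\" or \"false\"):",
  "output": "{label}"
}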