Open JiangshuoZhao opened 6 months ago
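Hello, which closed issue are you referring to? In principle, once training is finished the model should follow the format of your training samples and should not generate much beyond it (make sure to append the EOS token at the end of each training sample).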
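The EOS advice above can be sketched as follows. This is a minimal illustration, not the repository's actual preprocessing: the helper name `build_training_text` is hypothetical, and the default EOS string `"</s>"` is an assumption that should be replaced by your tokenizer's own EOS token.

```python
def build_training_text(prompt: str, answer: str, eos_token: str = "</s>") -> str:
    """Concatenate prompt and answer, then make sure the sample ends with EOS.

    eos_token defaults to "</s>" purely as an assumption; use the value
    your tokenizer reports instead.
    """
    text = prompt + answer
    # Append EOS only if it is missing, so calling this twice is harmless.
    if not text.endswith(eos_token):
        text += eos_token
    return text


# Hypothetical sample in the prompt format shown in this thread.
sample = build_training_text("### Response:\n", "Yes.")
print(repr(sample))
```

With the EOS token at the end of every training target, the model has a chance to learn to stop after "Yes." or "No." instead of continuing until the token limit is reached.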
In issue 23, these are his intermediate results:
The following is an interception of some intermediate results:
['Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request. # noqa: E501\n\n### Instruction:\nGiven the user's preference and unpreference, identify whether the user will like the target movie by answering "Yes." or "No.".\n\n### Input:\nUser Preference: "Paris, Texas (1984)", "Rebel Without a Cause (1955)", "Return of the Pink Panther, The (1974)", "Ace Ventura: Pet Detective (1994)", "Magnificent Seven, The (1954)", "Star Trek: The Wrath of Khan (1982)", "Cat People (1982)", "Orlando (1993)", "Dave (1993)"\nUser Unpreference: "Kalifornia (1993)"\nWhether the user will like the target movie "Perez Family, The (1995)"?\n\n### Response:\nYes.\n\n### Explanation:\nThe user prefers "Paris, Texas (1984)", "Rebel Without a Cause (1955)", "Return of the Pink Panther, The (1974)", "Ace Ventura: Pet Detective (1994)", "Magnificent Seven, The (1954)", "Star Trek: The Wrath of Khan (1982)", "Cat People (1982)", "Orlando (1993)", "Dave (199', 'Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request. # noqa: E501\n\n### Instruction:\nGiven the user's preference and unpreference, identify whether the user will like the target movie by answering "Yes." 
or "No.".\n\n### Input:\nUser Preference: "Rebel Without a Cause (1955)", "Return of the Pink Panther, The (1974)", "Ace Ventura: Pet Detective (1994)", "Magnificent Seven, The (1954)", "Star Trek: The Wrath of Khan (1982)", "Cat People (1982)", "Orlando (1993)", "Dave (1993)"\nUser Unpreference: "Kalifornia (1993)", "Perez Family, The (1995)"\nWhether the user will like the target movie "Jurassic Park (1993)"?\n\n### Response:\nYes.\n\n### Explanation:\nThe user prefers "Rebel Without a Cause (1955)", "Return of the Pink Panther, The (1974)", "Ace Ventura: Pet Detective (1994)", "Magnificent Seven, The (1954)", "Star Trek: The Wrath of Khan (1982)", "Cat People (1982)", "Orlando (1993)", "Dave (1993)" and unpreferences "Kalifornia'] ['Yes.\n\n### Explanation:\nThe user prefers "Paris, Texas (1984)", "Rebel Without a Cause (1955)", "Return of the Pink Panther, The (1974)", "Ace Ventura: Pet Detective (1994)", "Magnificent Seven, The (1954)", "Star Trek: The Wrath of Khan (1982)", "Cat People (1982)", "Orlando (1993)", "Dave (199', 'Yes.\n\n### Explanation:\nThe user prefers "Rebel Without a Cause (1955)", "Return of the Pink Panther, The (1974)", "Ace Ventura: Pet Detective (1994)", "Magnificent Seven, The (1954)", "Star Trek: The Wrath of Khan (1982)", "Cat People (1982)", "Orlando (1993)", "Dave (1993)" and unpreferences "Kalifornia'] [[0.5731707811355591, 0.4268292486667633], [0.5828027129173279, 0.4171972870826721]] 1it [00:06, 6.24s/it]['Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request. # noqa: E501\n\n### Instruction:\nGiven the user's preference and unpreference, identify whether the user will like the target movie by answering "Yes." 
or "No.".\n\n### Input:\nUser Preference: "Return of the Pink Panther, The (1974)", "Ace Ventura: Pet Detective (1994)", "Magnificent Seven, The (1954)", "Star Trek: The Wrath of Khan (1982)", "Cat People (1982)", "Orlando (1993)", "Dave (1993)", "Jurassic Park (1993)"\nUser Unpreference: "Kalifornia (1993)", "Perez Family, The (1995)"\nWhether the user will like the target movie "Manhattan Murder Mystery (1993)"?\n\n### Response:\nYes.\n\n### Explanation:\nThe user prefers "Return of the Pink Panther, The (1974)", "Ace Ventura: Pet Detective (1994)", "Magnificent Seven, The (1954)", "Star Trek: The Wrath of Khan (1982)", "Cat People (1982)", "Orlando (1993)", "Dave (1993)", "Jurassic Park (1993)" and unpreferences "Kalifornia (1', 'Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request. # noqa: E501\n\n### Instruction:\nGiven the user's preference and unpreference, identify whether the user will like the target movie by answering "Yes." 
or "No.".\n\n### Input:\nUser Preference: "Ace Ventura: Pet Detective (1994)", "Magnificent Seven, The (1954)", "Star Trek: The Wrath of Khan (1982)", "Cat People (1982)", "Orlando (1993)", "Dave (1993)", "Jurassic Park (1993)", "Manhattan Murder Mystery (1993)"\nUser Unpreference: "Kalifornia (1993)", "Perez Family, The (1995)"\nWhether the user will like the target movie "Sleeper (1973)"?\n\n### Response:\nYes.\n\n### Explanation:\nThe user prefers "Ace Ventura: Pet Detective (1994)", "Magnificent Seven, The (1954)", "Star Trek: The Wrath of Khan (1982)", "Cat People (1982)", "Orlando (1993)", "Dave (1993)", "Jurassic Park (1993)", "Manhattan Murder Mystery (1993)" and unpreferences "Kalifornia (199'] ['Yes.\n\n### Explanation:\nThe user prefers "Return of the Pink Panther, The (1974)", "Ace Ventura: Pet Detective (1994)", "Magnificent Seven, The (1954)", "Star Trek: The Wrath of Khan (1982)", "Cat People (1982)", "Orlando (1993)", "Dave (1993)", "Jurassic Park (1993)" and unpreferences "Kalifornia (1', 'Yes.\n\n### Explanation:\nThe user prefers "Ace Ventura: Pet Detective (1994)", "Magnificent Seven, The (1954)", "Star Trek: The Wrath of Khan (1982)", "Cat People (1982)", "Orlando (1993)", "Dave (1993)", "Jurassic Park (1993)", "Manhattan Murder Mystery (1993)" and unpreferences "Kalifornia (199'] [[0.5806307196617126, 0.41936925053596497], [0.5781132578849792, 0.42188674211502075]]
The generated output has this format: ### Response:\nYes.\n\n### Explanation:\nThe user prefers
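Hello, this is probably because the LoRA weights were not loaded during inference. A model that has been trained normally to convergence generally does not output an Explanation. Could you share your environment and machine versions?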
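One rough way to check whether LoRA weights are attached at inference time is to look for adapter parameters by name. The sketch below assumes the adapter parameter names contain the substring `lora_` (for example `lora_A` and `lora_B`); the helper `lora_parameter_names` and the example names are hypothetical.

```python
def lora_parameter_names(parameter_names):
    """Return the names that look like LoRA adapter weights.

    Assumption: adapter parameters carry "lora_" in their name.
    """
    return [name for name in parameter_names if "lora_" in name]


# Hypothetical parameter names, for illustration only.
example_names = [
    "base_model.model.layers.0.self_attn.q_proj.weight",
    "base_model.model.layers.0.self_attn.q_proj.lora_A.weight",
    "base_model.model.layers.0.self_attn.q_proj.lora_B.weight",
]
found = lora_parameter_names(example_names)
print(len(found), "LoRA parameters found")
```

With a real PyTorch model, the names could come from `[name for name, _ in model.named_parameters()]`. An empty result would suggest the adapter was never attached. A non-empty result only shows that adapter parameters exist, not that the trained values were loaded into them.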
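Regarding the evaluate function in evaluate.py (lines 127-165): the returned output is not actually used, and the probability is computed from generation_output.scores[0] instead. Why is [0] used here? Also, the default max_new_token of this function is 128. After the model has answered Yes or No, what does the rest of the output look like? Have you looked into the output? In a closed issue I saw an output that contains "###Explain:...", but I do not get this kind of result after my own training. Below are my results on the first 8 examples of the book test set.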
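On the `[0]` index: if `generation_output` comes from Hugging Face `generate` with `output_scores=True` and `return_dict_in_generate=True`, then `scores` holds one entry per generated step, so `scores[0]` is the distribution for the first new token, which is where "Yes" or "No" is expected. The pure-Python sketch below illustrates the idea on toy data. The token ids, the logit values, and the helper `yes_no_probability` are all hypothetical, and the exact normalization used in evaluate.py may differ.

```python
import math


def yes_no_probability(first_step_logits, yes_id, no_id):
    """Softmax restricted to the Yes/No entries of the first-step logits."""
    yes_logit = first_step_logits[yes_id]
    no_logit = first_step_logits[no_id]
    # Subtract the max before exponentiating for numerical stability.
    m = max(yes_logit, no_logit)
    e_yes = math.exp(yes_logit - m)
    e_no = math.exp(no_logit - m)
    total = e_yes + e_no
    return e_yes / total, e_no / total


# Toy stand-in for generation_output.scores for a single sequence:
# one list of vocabulary logits per generated step (vocabulary size 4).
scores = (
    [0.1, 2.0, 1.0, -0.5],  # step 0: the first generated token
    [0.3, 0.0, 0.0, 4.0],   # step 1: whatever is generated afterwards
)
YES_ID, NO_ID = 1, 2  # hypothetical token ids

p_yes, p_no = yes_no_probability(scores[0], YES_ID, NO_ID)
print(p_yes, p_no)
```

Only step 0 enters the probability, so anything generated after the first token (such as an Explanation section) would not change the Yes/No score, although it still costs generation time up to the new-token limit.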