I tried the product demo, and it was unexpectedly good. But I noticed the Q&A dataset has not been released, so I would like to ask how it was constructed. The technical report says it was built via Behavior Shaping, Knowledge Expansion, and Thinking Development, but I could not work out how these three methods are actually used to build the Q&A dataset.
We first collected open-source datasets that typically contain only questions and answers (sometimes with additional information and references). We then used GPT-3.5 Turbo to refine these datasets, roughly as follows (a minimal sketch of the prompts appears after this list).

- Behavior shaping: legal questions often follow a specific reasoning pattern: law article + facts of the question -> conclusion, i.e., the legal syllogism (法律三段论). We instruct ChatGPT to approach each problem this way and give it all the information available in the original dataset. ChatGPT can then produce an output that follows this reasoning pattern without hallucinating, because we have already supplied the correct/reference answer. That output can then be used as our training data.
- Knowledge expansion: for multiple-choice questions, the correct choice alone carries very little information. Again we give ChatGPT the question together with the correct option, asking it to produce a full deduction. By providing the correct option we (hopefully) keep ChatGPT from generating a wrong deduction. This full deduction provides much more training signal than a bare label.
- Thinking development: this is already covered in the first point. For more background, please refer to the chain-of-thought (CoT) paper.

More technical details are in the report.
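For concreteness, here is a minimal sketch of what the two refinement prompts could look like. The prompt wording, template field names, and the `refine` helper are illustrative assumptions, not the project's actual templates; only the use of GPT-3.5 Turbo is from the report.

```python
# Hypothetical sketch of the two refinement prompts described above.
# Prompt wording and field names are assumptions for illustration only.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

BEHAVIOR_SHAPING_PROMPT = """You are a legal assistant. Answer using a legal syllogism:
first cite the relevant law article (major premise), then state the facts of the
question (minor premise), then derive the conclusion.

Question: {question}
Relevant law article: {article}
Reference answer: {answer}

Rewrite the reference answer as a full syllogistic reasoning chain."""

KNOWLEDGE_EXPANSION_PROMPT = """You are a legal assistant. For the multiple-choice
question below, the correct option is given. Explain step by step why it is correct
and why each of the other options is wrong.

Question: {question}
Options: {options}
Correct option: {label}"""

def refine(template: str, **fields) -> str:
    """Fill in one prompt template and ask GPT-3.5 Turbo for the refined answer."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": template.format(**fields)}],
        temperature=0.2,  # low temperature: faithful refinement, not creativity
    )
    return response.choices[0].message.content

# Example: refined = refine(KNOWLEDGE_EXPANSION_PROMPT,
#                           question=q, options=opts, label=correct_label)
```

The key design point, as described above, is that the reference answer or correct label is always included in the prompt, so the model is anchored to the known-correct result rather than free to invent one.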
Could you explain how the General data was sampled from Alpaca-GPT4 and Firefly? Is there a fixed mixing ratio?
In practice, we used all of Alpaca-GPT4-zh and sampled evenly from Firefly.
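As a rough sketch of what "evenly sampled" could mean in practice: draw the same number of examples from each Firefly task type so that no single type dominates. The field name `kind` and the per-type quota below are assumptions for illustration.

```python
# Hypothetical even sampling across Firefly task types.
# The "kind" field and per_kind quota are assumptions, not confirmed details.
import random
from collections import defaultdict

def sample_evenly(records, per_kind=1000, seed=42):
    """Draw up to `per_kind` examples from each task type, then shuffle."""
    by_kind = defaultdict(list)
    for rec in records:
        by_kind[rec["kind"]].append(rec)
    rng = random.Random(seed)
    sampled = []
    for recs in by_kind.values():
        sampled.extend(rng.sample(recs, min(per_kind, len(recs))))
    rng.shuffle(sampled)
    return sampled
```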
Would you mind providing more information about the knowledge expansion method? My question is: how do you ensure correctness when asking ChatGPT to give full explanations of the correct and wrong options? In my experience, ChatGPT is currently not very strong on Chinese law and the related analysis, and it is often hard to tell when it will give unreliable results, because it is very good at making up stories.
@yucc-leon xref #19: I have answered your question there, so let's continue the conversation in that issue instead.
Closing as completed.