THUDM / WebGLM

WebGLM: An Efficient Web-enhanced Question Answering System (KDD 2023)
Apache License 2.0

Could you explain the specific method used to generate the training data for the vector retrieval model? #41

Open campuslifeceo opened 1 year ago

campuslifeceo commented 1 year ago

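This includes the GPT-3 prompt and how the labels are calculated. I computed ROUGE-1 between the query and the ref in the training set, and the numbers don't seem to match. How were the 0.8 and 0.2 in the example below calculated? Why can't I reproduce them?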

example: {'question': 'Why does wine taste better the older it gets?', 'positive_reference': '"No, wine does not always taste better with age. This is because tannins, which give wine its astringent taste, break down over time. However, some wines may taste better after being exposed to oxygen.', 'positive_label': 0.8, 'negative_reference': 'Does wine taste better with age? The answer is: maybe! We’ll take a look at the why, how, and what of aging wine so you can be a more discerning drinker.', 'negative_label': 0.2}
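For reference, here is a minimal sketch of the kind of ROUGE-1 check described above, applied to this example. It is not the WebGLM authors' labeling method, which this thread does not document. The tokenizer (lowercased alphanumeric runs), the choice to treat the question as the reference text, and the helper names `tokenize` and `rouge1` are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumed tokenizer: lowercase, keep runs of letters and digits only.
    return re.findall(r"[a-z0-9]+", text.lower())


def rouge1(candidate, reference):
    # Unigram overlap with clipped counts (multiset intersection).
    cand = Counter(tokenize(candidate))
    ref = Counter(tokenize(reference))
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    if overlap == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


question = "Why does wine taste better the older it gets?"
positive_reference = (
    '"No, wine does not always taste better with age. This is because '
    "tannins, which give wine its astringent taste, break down over time. "
    "However, some wines may taste better after being exposed to oxygen."
)
negative_reference = (
    "Does wine taste better with age? The answer is: maybe! We’ll take a "
    "look at the why, how, and what of aging wine so you can be a more "
    "discerning drinker."
)

# Assumption: the question plays the role of the reference text.
print("positive:", rouge1(positive_reference, question))
print("negative:", rouge1(negative_reference, question))
```

Scores computed this way are compared against `positive_label` (0.8) and `negative_label` (0.2); a mismatch would suggest the labels come from a different signal than query-to-reference ROUGE-1.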

sticktoFE commented 1 year ago

Same question here.

hjs2027864933 commented 1 year ago

Same question here.