hkust-nlp / deita

Deita: Data-Efficient Instruction Tuning for Alignment [ICLR2024]
Apache License 2.0

Question about which score to ultimately use for the filtering process. #19

Closed 447428054 closed 5 months ago

447428054 commented 6 months ago

As illustrated in the left part of Figure 1, we then ask ChatGPT to rank and score these 6 samples (prompt in Appendix E.2), obtaining the complexity scores c corresponding to the instructions. We emphasize that, distinct from direct scoring, we give ChatGPT all 6 samples within one prompt – these samples represent different evolution stages of the same original sample and such a scoring scheme helps ChatGPT capture the small complexity differences among them, which leads to complexity scores to achieve finer-grained complexity differentiation among samples.

So is the score used for filtering the highest complexity score, or the original score? And is Evol used to make the score more reliable?

VPeterV commented 6 months ago

Hi. We use the original score. And yes, the Evol method combined with rank + score makes the score more reliable.
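
To make the distinction concrete, here is a minimal sketch, not taken from the Deita codebase. It assumes each group holds the original sample (stage 0) plus its evolved variants, all scored together in one prompt. The names `ScoredVariant`, `filtering_score`, `keep_sample`, and the threshold are hypothetical and only illustrate "use the original score" as opposed to "use the highest score in the group".

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScoredVariant:
    stage: int         # assumed layout: 0 = original sample, 1..5 = evolved variants
    instruction: str
    complexity: float  # score assigned when all variants are ranked in one prompt


def filtering_score(variants: List[ScoredVariant]) -> float:
    """Return the complexity score of the original (stage 0) sample in the group."""
    for variant in variants:
        if variant.stage == 0:
            return variant.complexity
    raise ValueError("group has no original (stage 0) sample")


def keep_sample(variants: List[ScoredVariant], threshold: float) -> bool:
    """Hypothetical threshold filter that looks only at the original sample's score."""
    return filtering_score(variants) >= threshold


# Made-up scores for one group of 6 variants of the same instruction.
group = [
    ScoredVariant(stage=i, instruction=f"variant {i}", complexity=float(i + 1))
    for i in range(6)
]

original = filtering_score(group)             # score of the stage 0 sample
highest = max(v.complexity for v in group)    # NOT what is used for filtering
```

In this sketch the evolved variants only serve to give the scorer a finer-grained comparison; the value carried forward for filtering is the one attached to the original sample.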

VPeterV commented 5 months ago

I'm closing this issue for now. If you have any additional questions or concerns, please don't hesitate to reopen it.