Better construction of reflection data in o1-journey?

GAIR-NLP / O1-Journey

O1 Replication Journey: A Strategic Progress Report – Part I


Better construction of reflection data in o1-journey? #5

Open YuMS opened 1 month ago

YuMS commented 1 month ago

It seems that some of the o1-journey reflection data published on Hugging Face actually "corrects" reasoning steps that were already correct. Is it possible that there is still room for improvement in how the reasoning tree is traversed and how reflections are constructed?

For example, I randomly sampled instances containing the keyword "wait". Every instance I checked (rows 19, 39, 56, 75) contained a reflection that appeared to be unnecessary.

https://huggingface.co/datasets/GAIR/o1-journey/
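
For anyone who wants to repeat this kind of spot check, here is a rough sketch of the keyword sampling, assuming each dataset row is a dict of strings. The helper `find_keyword_rows`, the field names `question` and `answer`, and the toy rows are all hypothetical and only illustrate the idea; they are not the actual o1-journey schema or data.

```python
import re


def find_keyword_rows(rows, keyword, fields):
    """Return indices of rows where `keyword` occurs as a whole word
    (case-insensitive) in any of the given fields."""
    pattern = re.compile(rf"\b{re.escape(keyword)}\b", re.IGNORECASE)
    hits = []
    for index, row in enumerate(rows):
        for field in fields:
            # Missing or None fields are treated as empty text.
            text = row.get(field) or ""
            if pattern.search(str(text)):
                hits.append(index)
                break  # count each row once, even if several fields match
    return hits


# Toy rows for illustration only (NOT taken from the real dataset).
toy_rows = [
    {"question": "What is 2 + 3?", "answer": "2 + 3 = 5."},
    {"question": "What is 4 * 6?",
     "answer": "4 * 6 = 24. Wait, let me re-check: 4 * 6 = 24."},
    {"question": "Is 9 prime?",
     "answer": "9 = 3 * 3, so no, without waiting further."},
]

wait_rows = find_keyword_rows(toy_rows, "wait", ["answer"])
print(wait_rows)
```

The whole-word match keeps words such as "waiting" from being counted as reflection markers. The same function should work on any iterable of dict-like rows, for example rows loaded with the Hugging Face `datasets` library, once the real column names are substituted.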

YuMS commented 1 month ago

BTW, I really appreciate your work. I think the construction of the o1-journey data is an important first step.

codelion commented 4 weeks ago

You can use the cot_reflection approach in optillm to construct CoT data with reflection: https://github.com/codelion/optillm