Linear95 / SPAG

Self-playing Adversarial Language Game Enhances LLM Reasoning, NeurIPS 2024
Apache License 2.0

Question about experiment in paper #2

Open yuezhao238 opened 5 months ago

yuezhao238 commented 5 months ago

Hi, thanks for this innovative work!

May I ask whether you have conducted more ablation experiments on the data (e.g., simply training for more steps in the imitation learning phase)? I think this is necessary to justify the efficiency of the RL phase.

Thanks for your precious time!

Linear95 commented 5 months ago

Thank you for your recognition of our work! The current draft contains the main results to show the effectiveness of SPAG. We are definitely working on more ablation experiments and will update the paper as soon as possible. I will reply to this issue again after the paper is updated.

yuezhao238 commented 5 months ago

Thanks for your reply! I will keep an eye on this.

Linear95 commented 4 months ago

Hi Zhao, we have updated the paper with more experiments, including the sample-efficiency comparison you mentioned. Please take another look at the paper; we hope it is helpful :)

yuezhao238 commented 4 months ago

Thanks for informing me! I'll check it out later❤️