uclaml / SPIN

The official implementation of Self-Play Fine-Tuning (SPIN)
https://uclaml.github.io/SPIN/
Apache License 2.0
1.05k stars 92 forks

Question about potential overfitting #38

Open kang-0909 opened 1 month ago

kang-0909 commented 1 month ago

Interesting work! I have a question about the experiments, particularly regarding the risk of overfitting.

As I understand it, the proposed method might encourage overfitting, especially with limited training data. However, in your results the model appears to generalize well, with no signs of overfitting.

Will performance continue to improve as the number of iterations increases? If so, can you explain why the model wouldn't overfit?