Question about AI Feedback (AIF)

In the AI Feedback (AIF) phase, GPT-4 serves as the teacher model. I am curious whether GPT-4 might tend to assign higher ratings to its own outputs?

Additionally, I am interested in how often each large language model was chosen as ${y_w}$ during the AI Feedback (AIF) evaluation in your study. Have you analyzed how frequently the different LLMs were selected?
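To make the second question concrete, here is a minimal sketch of the tally I have in mind. It assumes each preference record holds a list of completions, each with a `model` name and a numeric `score` from the teacher; these field names, the model names, and the toy data are hypothetical and not taken from your dataset or code. It also assumes ${y_w}$ is simply the top-scored completion for each prompt.

```python
from collections import Counter


def chosen_model_distribution(records):
    """Return the share of prompts for which each model's completion is y_w.

    Assumes y_w is the completion with the highest teacher score.
    Ties are resolved in favour of the first completion listed.
    """
    counts = Counter()
    for record in records:
        completions = record["completions"]
        if not completions:
            continue  # skip prompts with no scored completions
        best = max(completions, key=lambda c: c["score"])
        counts[best["model"]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {model: n / total for model, n in counts.items()}


# Hypothetical toy records, for illustration only.
toy_records = [
    {"completions": [{"model": "gpt-4", "score": 9.0},
                     {"model": "model-a", "score": 7.0}]},
    {"completions": [{"model": "gpt-4", "score": 8.5},
                     {"model": "model-b", "score": 6.0}]},
    {"completions": [{"model": "model-a", "score": 8.0},
                     {"model": "model-b", "score": 5.0}]},
    {"completions": [{"model": "gpt-4", "score": 4.0},
                     {"model": "model-b", "score": 7.5}]},
]

distribution = chosen_model_distribution(toy_records)
for model, share in sorted(distribution.items()):
    print(f"{model}: {share:.2f}")
```

If GPT-4's share as ${y_w}$ were much larger than its share of the sampled completions, that might hint at the self-preference I asked about above, although it could equally reflect a real quality gap.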

Thank you!

huggingface / alignment-handbook