Closed pandaupc closed 4 months ago
The paper states: "We obtain code preference data based on compiler-feedback, and mathematical preference data based on the ground-truth labels." Could you explain in detail how this is done?
We currently have no plans to disclose information beyond what is in the technical report. @pandaupc