Belzedar94 closed 1 month ago
Thank you for your suggestion! Because access to o1-preview is heavily restricted, its internal reasoning tokens add cost, and it does not currently support multimodal input, we have not yet run the full evaluation. However, we have tested a subset (https://x.com/Z_Huang_02/status/1834634575345270898), which still supports some qualitative conclusions.
At the same time, because o1-preview performs an additional internal reasoning step before answering, it is still debatable whether comparing it directly with other models is fair.
Request in title. Thanks :)