kanseaveg closed this issue 5 months ago
The BIRD result in the paper is based on GPT-4-32K. Since the BIRD dev set consists of 1,534 examples and a single evaluation run is expensive, we have not tested the performance of other GPT-4 versions. However, I believe the differences should not be significant.
Then you have not benefited from the more recent model improvements. As for the effect of GPT-4-32K: when I previously used GPT-4 to reproduce DEA-SQL, the results were not ideal.
In issue #7, I see that the Spider results in the paper were obtained with GPT-4-32K. Which model are the BIRD results based on: GPT-4, GPT-4-0613, GPT-4-16K, or GPT-4-32K?