Closed fgenie closed 4 months ago
1037 / 1319 (78.6%)
1061 / 1319 (80.4%)
1003 / 1319 (76.0%)
1210 / 1319 (91.7%)
1238 / 1319 (93.9%)
| model | method | total | fail | nonconflict | conflict | total (%) |
|---|---|---|---|---|---|---|
| chatgpt | baseline | 1099 / 1319 | 0 / 1319 | 1059 / 1160 | 40 / 159 | 83.32% |
| chatgpt | rims | 1122 / 1319 | 0 / 1319 | 1059 / 1160 | 63 / 159 | 85.06% |
| gpt4turbo | baseline | 1256 / 1319 | 0 / 1319 | 1249 / 1297 | 7 / 22 | 95.22% |
| gpt4turbo | rims | 1259 / 1319 | 0 / 1319 | 1249 / 1297 | 10 / 22 | 95.45% |
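
For anyone double-checking the numbers above, here is a minimal sketch of how the totals and percentages appear to be derived. It assumes that `nonconflict` and `conflict` are two disjoint subsets that together make up the 1319 items, and that each entry is (correct, subset size). The names `RESULTS`, `summarize`, and `conflict_gain` are hypothetical helpers, not part of the original evaluation scripts.

```python
# Counts copied from the results above: (correct, subset size) per subset.
# Assumption: nonconflict + conflict partition the 1319 evaluated items.
RESULTS = {
    ("chatgpt", "baseline"): {"nonconflict": (1059, 1160), "conflict": (40, 159)},
    ("chatgpt", "rims"): {"nonconflict": (1059, 1160), "conflict": (63, 159)},
    ("gpt4turbo", "baseline"): {"nonconflict": (1249, 1297), "conflict": (7, 22)},
    ("gpt4turbo", "rims"): {"nonconflict": (1249, 1297), "conflict": (10, 22)},
}


def summarize(subsets):
    """Return (correct, size, percent correct) summed over all subsets."""
    correct = sum(c for c, _ in subsets.values())
    size = sum(n for _, n in subsets.values())
    return correct, size, 100.0 * correct / size


def conflict_gain(model):
    """Extra conflict-subset items solved by rims compared with baseline."""
    rims = RESULTS[(model, "rims")]["conflict"][0]
    base = RESULTS[(model, "baseline")]["conflict"][0]
    return rims - base


for (model, method), subsets in RESULTS.items():
    correct, size, pct = summarize(subsets)
    print(f"{model:10s} {method:9s} total: {correct} / {size} ({pct:.2f}%)")
```

Under that reading, the nonconflict counts are identical between baseline and rims for each model, so the whole difference in the totals comes from the conflict subset: 40 to 63 for chatgpt and 7 to 10 for gpt4turbo.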