PootieT closed 1 year ago
Agreed. But, we may need to generalize this to work on lists of numbers as well.
It seems like check-within allows comparison between lists:
(check-within (list 0 2.0 3 5 9 123) (list 0 2 3 5 9 123) 0.01) ; passes
However, in one odd case a program returned a set whose elements all matched the expected output, while the expected output was a list. In that case no current checking method treats the two values as equal. Perhaps that is for the best.
(check-match (set 0 2 3 5 9 123) (list 0 2 3 5 9 123)) ; does not pass
Conveniently, it seems like check-within supports heterogeneous lists too:
Welcome to Racket v8.2 [cs].
> (require rackunit)
> (check-within '("hi" 2) '("hi" 2.001) 0.05)
> (check-within '("hi" 2) '("hi" 2.1) 0.05)
--------------------
; FAILURE [,bt for context]
name: check-within
location: readline-input:3:0
actual: '("hi" 2)
expected: '("hi" 2.1)
--------------------
>
So, instead of check-equal?, we should be able to just use check-within.
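To make the swap concrete, here is a minimal sketch. The function `candidate` is a hypothetical stand-in for a generated solution and is not taken from the benchmark. It only illustrates why check-equal? is too strict when a program returns floats where exact integers are expected.

```racket
#lang racket
(require rackunit)

;; Hypothetical stand-in for a generated solution: it returns floats
;; where the expected output uses exact integers.
(define (candidate xs)
  (map exact->inexact xs))

;; equal? treats 2.0 and 2 as different values, so check-equal? would
;; report a failure for this comparison:
;; (check-equal? (candidate (list 0 2 3)) (list 0 2 3))

;; check-within walks the structure the way equal? does, but lets
;; numbers differ by up to the given epsilon.
(check-within (candidate (list 0 2 3)) (list 0 2 3) 0.01)
```

Non-numeric elements are still compared exactly, which matches the heterogeneous-list behaviour shown in the REPL session above.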
Fixed. Racket performance on a model increases slightly from 10.62% to 11.19%. I suspect with better Racket training data, it will have more of an impact.
Example program: HumanEval_99_closest_integer
This is the current test
which outputs:
Here are some alternatives we may consider (source):
All of them would pass with the same inputs. The second and third versions check equivalence within a small error range.
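As a rough illustration of what a tolerance-based check could look like for this problem, here is a sketch. It is not one of the alternatives referenced above: `closest-integer` is a hypothetical, simplified solution written only for this example, and the epsilon of 0.001 is an arbitrary choice.

```racket
#lang racket
(require rackunit)

;; Hypothetical, simplified solution: parse the string and round.
;; Racket's round sends ties to the even integer, so this is only an
;; illustration and not a reference implementation of the task.
(define (closest-integer s)
  (round (string->number s)))

;; (closest-integer "15.3") is the float 15.0, which is not equal? to
;; the exact integer 15, so check-equal? against 15 would report a failure.

;; check-= compares two numbers within an epsilon.
(check-= (closest-integer "15.3") 15 0.001)

;; check-within accepts the same scalar case and also handles numbers
;; nested inside lists.
(check-within (closest-integer "15.3") 15 0.001)
(check-within (list (closest-integer "10") (closest-integer "15.3"))
              (list 10 15)
              0.001)
```

check-= only accepts numbers, so check-within is the more general choice when the expected value may be a list.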