Closed Maykeye closed 1 year ago
Thanks for the report! I'm with you on just dropping the whole first line. I think switching from a list of hard-coded prefixes to just ```(\w*) would be the most generic fix, so we don't have to worry about any future permutations.
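For illustration, here is a minimal sketch of what that generic match could look like. The name `extract_fenced_code` is hypothetical and this is not the actual evaluate.py code; it only assumes the answer contains one fenced block with an optional language tag on the opening line.

```python
import re

# Built at runtime so the example itself contains no literal fence.
FENCE = "`" * 3

# Opening fence, optional language tag (captured but ignored), the rest of
# that line, then a lazy capture of everything up to the next fence.
FENCE_RE = re.compile(FENCE + r"(\w*)[^\n]*\n(.*?)" + FENCE, re.DOTALL)


def extract_fenced_code(answer):
    """Return the body of the first fenced block, or None if no fence is found."""
    match = FENCE_RE.search(answer)
    if match is None:
        return None
    return match.group(2).strip()
```

Because the tag is matched by `(\w*)` instead of a fixed list, `js`, `javascript`, `py`, and an untagged fence would all be handled the same way.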
Thanks again @Maykeye. I've pushed fixes that enable evaluate to be used as a library and make the search for the start of the code block language-agnostic. Could you give it a try? I have opened a new issue, #7, to track your first point: some of the interview executors include the prompt in their output, which is undesirable.
Works fine for Guanaco-33b now. I also checked OpenAssistant_oasst-sft-1-pythia-12b; it only fails for
def fact(n):
    if n == 0:
        return 1
    else:
        return n * fact(n - 1)
Explanation:
The function takes an integer as an argument, n. If n is zero, it returns 1 since there is no factorial for zero. Otherwise, it recursively calls itself with n minus one as the argument until n is equal to 1, at which point the function returns the value of n multiplied by the result of the recursive call.
which is returned verbatim (here the model ignores both the instruction about ``` and the instruction not to provide an explanation). But that's a separate issue, and I'm not sure evaluate.py should go the extra mile for models that don't do what they are asked to. The original issue is solved.
I tried to run on Guanaco-33b by modifying interview-llamacpp.sh, and hit an issue with the evaluator.
Consider this:
This function takes a single argument `n`, which represents the number for which you want to calculate the factorial. The function first checks if `n` is equal to 0. If it is, then the function returns 1, as the factorial of 0 is defined to be 1. Otherwise, the function calculates the factorial recursively by multiplying `n` with the result of calling the same function on `n - 1`.

There are two issues at play:
1) ``` is a part of the included prompt.
extract_code will always find it, because it is always present.
So extract_code first thinks that the ''' in the prompt is the start of the actual code block, then it mistakes the start of the real code block for the end of the code block.
2) extract_code doesn't support '''js, only '''javascript. E.g. on GitHub, js is a valid Markdown alias for javascript.
So even if 1) were fixed, the current extract_code wouldn't parse the answer correctly: it would return
'js\nfunction fact(n) {\n if (n === 0) {\n return 1;\n } else {\n return n * fact(n - 1);\n }\n}'
without removing the `js` tag.
Maybe the whole line with the first ''' should be erased?
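To make both points concrete, here is a rough sketch, not the real evaluate.py. `strip_prompt` and `extract_dropping_fence_line` are hypothetical names, and the sketch assumes the executor echoes the prompt verbatim at the start of its output.

```python
# Built at runtime so the example itself contains no literal fence.
FENCE = "`" * 3


def strip_prompt(output, prompt):
    """Drop the echoed prompt if the output starts with it (assumed verbatim echo)."""
    if output.startswith(prompt):
        return output[len(prompt):]
    return output


def extract_dropping_fence_line(answer):
    """Return the text between the first two fences, discarding the whole
    opening fence line (and with it any language tag such as js or py)."""
    start = answer.find(FENCE)
    if start == -1:
        return None
    line_end = answer.find("\n", start)
    if line_end == -1:
        return None
    end = answer.find(FENCE, line_end + 1)
    if end == -1:
        end = len(answer)  # unterminated block: take everything that remains
    return answer[line_end + 1:end].strip()
```

Without the `strip_prompt` step, the fence mentioned in the prompt is treated as the opening fence and the real opening fence as the closing one, which is the failure described in 1).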
(On a similar note, `'''py` seems to be a valid language name for Python, at least as far as GitHub's Markdown engine is concerned.)
(Also, evaluate.py has no `__name__ == "__main__"` guard, which makes importing it from IPython impossible without extracting the function first.)
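The guard I mean would look roughly like the sketch below. The body of `extract_code` is a placeholder and `main` is a hypothetical entry point, since I'm only illustrating the layout, not the actual script.

```python
import sys


def extract_code(answer):
    # Placeholder standing in for the real extraction logic.
    return answer.strip()


def main(argv):
    # Hypothetical entry point: treat each command-line argument as an answer.
    for answer in argv[1:]:
        print(extract_code(answer))
    return 0


if __name__ == "__main__":
    # Runs only when the file is executed as a script, not when it is imported.
    main(sys.argv)
```

With this layout, `from evaluate import extract_code` in IPython would only define the functions and would not start an evaluation run.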