Closed: Wehzie closed this issue 1 year ago.
Same issue here!
This is caused by the following code from lm_eval/api/task.py. The `test_target` is an int, and I believe the correct code would be `test_target = [self.doc_to_choice(test_doc)[test_target]]`. This looks like a typo.
```python
if type(test_target) is list:
    self.multiple_target = len(test_target)
else:
    if (type(test_target) is int) and (test_choice is not None):
        # `test_target` is an int here; I believe the correct code would be
        # `test_target = [self.doc_to_choice(test_doc)[test_target]]`
        test_target = [self.doc_to_choice(test_target)[test_target]]
    else:
        test_target = [test_target]

if test_choice is not None:
    check_choices = test_choice
else:
    check_choices = test_target
```
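For illustration, here is a minimal, self-contained sketch of the proposed fix. The `doc_to_choice` function, `test_doc`, and the choice strings below are stand-ins for the real lm_eval objects, not the actual implementation:

```python
# Stand-in for lm_eval's doc_to_choice: maps a document (a mapping)
# to its list of answer-choice strings.
def doc_to_choice(doc):
    return doc["choices"]

test_doc = {"choices": ["A", "B", "C", "D"]}  # hypothetical document
test_target = 2  # the gold label arrives as an int index, not a string

# The buggy line passes the int itself: doc_to_choice(test_target) hands
# an int where a mapping is expected, which later blows up in jinja2.
# The proposed fix looks up the choices of the *document*, then indexes
# with the int label.
test_target = [doc_to_choice(test_doc)[test_target]]
print(test_target)  # ['C']
```

With the fix, `test_target` becomes a one-element list of the gold choice string, matching what the downstream comparison code expects.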
Same here. I always get `jinja2.environment.Template.render() argument after ** must be a mapping, not int` for many benchmarks (arc_challenge, hellaswag, sciq) on big-refactor.
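That message actually comes from the Python interpreter rather than jinja2: whenever an int is unpacked with `**` at a call site, CPython raises this TypeError naming the callee, and `Template.render(**doc)` with an int `doc` is exactly that situation. A minimal stand-in reproduces it with no jinja2 installed (`render` below is a hypothetical placeholder):

```python
def render(**kwargs):
    # stand-in for jinja2.environment.Template.render
    return kwargs

try:
    render(**2)  # simulates Template.render(**doc) where doc is an int
except TypeError as e:
    # message reads like "render() argument after ** must be a mapping, not int"
    print(e)
```

So the traceback points at jinja2, but the root cause is the int gold label reaching the template-rendering call upstream.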
@lintangsutawika does one of your recent PRs fix this issue, or do we still need to push a fix for this?
Oh no, really sorry I missed this issue. And yes, I remember hitting this same issue and patching a fix.
No worries! Just was wondering if #819 fixed this. I think we're also encountering an issue where Hellaswag fails to download properly from Github's CDN as of today/yesterday.
Just confirming #819 fixes this! Tested with:

```shell
python -m lm_eval \
    --model hf \
    --model_args pretrained=EleutherAI/pythia-160m,dtype="float" \
    --tasks hellaswag,arc_challenge \
    --device cuda:0 \
    --batch_size 8
```

on the most recent commit of big-refactor, and evaluations run and finish.
Dear maintainers, I fail to run the hellaswag benchmark on the big-refactor branch. A vicuna-7b-v1.3 model, run in float16, is loaded.

Update: the same problem occurs with arc_challenge. The logs are obtained by modifying `doc_to_choice()` in lm_eval/api/task.py as follows. The error originates from `apply_template()` in lm_eval/utils.py. Here I added some debugging info as follows.
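The debug patch itself is not shown above. Purely as an illustration of this kind of instrumentation (hypothetical names, not the actual lm_eval code), one can wrap the suspect function and print the type of each argument to spot an int arriving where a mapping is expected:

```python
import functools

def log_types(fn):
    # Hypothetical debug decorator: print the type of every positional
    # argument before the wrapped function runs.
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        print(f"{fn.__name__} args:", [type(a).__name__ for a in args])
        return fn(*args, **kwargs)
    return wrapper

@log_types
def apply_template(template, doc):
    # stand-in for lm_eval.utils.apply_template; the real one renders
    # a jinja2 template from the doc mapping
    return template.format(**doc)

apply_template("Q: {question}", {"question": "2+2?"})
```

If `doc` shows up as `int` in the logged types, the gold-label bug described above is the culprit.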