huggingface / lighteval

Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends
MIT License

Cannot evaluate chat model on TruthfulQA (`TypeError: can only concatenate str (not "list") to str`) #66

Closed lewtun closed 7 months ago

lewtun commented 7 months ago

I am trying to evaluate a small Qwen model on TruthfulQA and am running the following command:

accelerate launch --multi_gpu --num_processes=8 scripts/evaluation/run_lighteval.py --tasks="lighteval|truthfulqa:mc|0|0" --output_dir "./scratch/evals" --model_args "pretrained=Qwen/Qwen1.5-0.5B-Chat" --override_batch_size 1 --use_chat_template

However, this throws the following error:

WARNING:lighteval.logging.hierarchical_logger:    Running RequestType.LOGLIKELIHOOD requests
WARNING:lighteval.logging.hierarchical_logger:  } [0:00:00.000261]
WARNING:lighteval.logging.hierarchical_logger:} [0:00:32.919941]
Traceback (most recent call last):
  File "/fsx/lewis/git/hf/h4/scripts/evaluation/run_lighteval.py", line 115, in <module>
    main(args)
  File "/fsx/lewis/miniconda3/envs/h4/lib/python3.10/site-packages/lighteval/logging/hierarchical_logger.py", line 144, in wrapper
    return fn(*args, **kwargs)
  File "/fsx/lewis/miniconda3/envs/h4/lib/python3.10/site-packages/lighteval/main_accelerate.py", line 91, in main
    evaluation_tracker = evaluate(
  File "/fsx/lewis/miniconda3/envs/h4/lib/python3.10/site-packages/lighteval/evaluator.py", line 60, in evaluate
    full_resps = lm.loglikelihood(requests, override_bs=override_bs)
  File "/fsx/lewis/miniconda3/envs/h4/lib/python3.10/site-packages/lighteval/models/base_model.py", line 496, in loglikelihood
    request.tokenized_context, request.tokenized_continuation = self.tok_encode_pair(
  File "/fsx/lewis/miniconda3/envs/h4/lib/python3.10/site-packages/lighteval/models/abstract_model.py", line 146, in tok_encode_pair
    continuation = context[-n_spaces:] + continuation
TypeError: can only concatenate str (not "list") to str

Note that there is no issue when the chat template is not activated, i.e. when `--use_chat_template` is omitted from the command above.
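
For reference, the error message can be reproduced in isolation. The last frame of the traceback concatenates a slice of `context` with `continuation`, and the message implies that `context` is a `str` while `continuation` is a `list` at that point. The sketch below is a hypothetical stand-in, not lighteval's actual code: the function name `move_trailing_spaces` and the whitespace handling around the failing expression are assumptions made for illustration. It does not show why the continuation ends up as a list when the chat template is enabled.

```python
# Hypothetical stand-in for the failing step; NOT lighteval's actual code.
def move_trailing_spaces(context: str, continuation):
    """Move trailing whitespace from the context onto the continuation."""
    n_spaces = len(context) - len(context.rstrip())
    if n_spaces > 0:
        # Same expression as the last frame of the traceback.
        continuation = context[-n_spaces:] + continuation
        context = context[:-n_spaces]
    return context, continuation


# A string continuation is handled without error.
print(move_trailing_spaces("Q: What is 2+2?\nA: ", "4"))

# A list continuation triggers the reported TypeError.
try:
    move_trailing_spaces("Q: What is 2+2?\nA: ", ["4"])
except TypeError as err:
    print(f"TypeError: {err}")
```

If this matches what happens inside `tok_encode_pair`, the fix would be to make sure the continuation is a single string before this step, or to handle each element of the list separately.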