open-compass / opencompass

OpenCompass is an LLM evaluation platform supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMA2, Qwen, GLM, Claude, etc.) over 100+ datasets.
https://opencompass.org.cn/
Apache License 2.0

[Bug] HuggingFacewithChatTemplate #1557

Open ZCzzzzzz opened 1 month ago

ZCzzzzzz commented 1 month ago

Prerequisite

Type

I'm evaluating with the officially supported tasks/models/datasets.

Environment

```
{'CUDA available': True,
 'GCC': 'gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0',
 'MMEngine': '0.10.4',
 'MUSA available': False,
 'NVCC': 'Not Available',
 'OpenCV': '4.10.0',
 'PyTorch': '2.1.0',
 'Python': '3.10.12 (main, May 26 2024, 00:14:02) [GCC 9.4.0]',
 'TorchVision': '0.16.0',
 'lmdeploy': '0.2.6',
 'numpy_random_seed': 2147483648,
 'opencompass': '0.3.2.post1+ee058e2',
 'sys.platform': 'linux',
 'transformers': '4.44.2'}
```

Reproduces the problem - code/configuration sample

The content of `configs/eval_hf_llama3-70b.py` is:

```python
from mmengine.config import read_base

with read_base():
    from .datasets.ceval.ceval_gen import ceval_datasets
    from .models.hf_llama.hf_llama3_70b_instruct_awq import models
    from .summarizers.example import summarizer

datasets = sum([v for k, v in locals().items() if k.endswith('_datasets') or k == 'datasets'], [])
work_dir = './outputs/llama3/'
```

The content of `hf_llama3_70b_instruct_awq.py` is:

```python
from opencompass.models import HuggingFacewithChatTemplate

models = [
    dict(
        type=HuggingFacewithChatTemplate,
        abbr='llama-3-70b-instruct-hf',
        path='/data/Llama-3-70B-Instruct-AWQ/',
        max_out_len=1024,
        batch_size=8,
        run_cfg=dict(num_gpus=8),
        stop_words=['<|end_of_text|>', '<|eot_id|>'],
    )
]
```
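For context, the error below comes from `HuggingFacewithChatTemplate` rendering each prompt through the tokenizer's chat template (see `opencompass/models/huggingface_above_v4_33.py` in the traceback). A minimal sketch of that call, reusing the checkpoint path from the config above:

```python
# Sketch of the call HuggingFacewithChatTemplate makes per prompt
# (cf. opencompass/models/huggingface_above_v4_33.py in the traceback below).
# A checkpoint whose tokenizer ships no chat_template fails exactly here.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('/data/Llama-3-70B-Instruct-AWQ/')
messages = [{'role': 'user', 'content': 'What is a computer network?'}]

if tokenizer.chat_template is not None:
    print(tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False))
else:
    print('tokenizer.chat_template is not set; apply_chat_template() would raise ValueError')
```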

Reproduces the problem - command or script

python run.py configs/eval_hf_llama3-70b.py

Reproduces the problem - error message

```
09/23 20:31:34 - OpenCompass - INFO - Task [llama-3-70b-instruct-hf/ceval-computer_network,llama-3-70b-instruct-hf/ceval-operating_system,llama-3-70b-instruct-hf/ceval-computer_architecture,llama-3-70b-instruct-hf/ceval-college_programming,llama-3-70b-instruct-hf/ceval-college_physics,llama-3-70b-instruct-hf/ceval-college_chemistry,llama-3-70b-instruct-hf/ceval-advanced_mathematics,llama-3-70b-instruct-hf/ceval-probability_and_statistics,llama-3-70b-instruct-hf/ceval-discrete_mathematics,llama-3-70b-instruct-hf/ceval-electrical_engineer,llama-3-70b-instruct-hf/ceval-metrology_engineer,llama-3-70b-instruct-hf/ceval-high_school_mathematics,llama-3-70b-instruct-hf/ceval-high_school_physics,llama-3-70b-instruct-hf/ceval-high_school_chemistry,llama-3-70b-instruct-hf/ceval-high_school_biology,llama-3-70b-instruct-hf/ceval-middle_school_mathematics,llama-3-70b-instruct-hf/ceval-middle_school_biology,llama-3-70b-instruct-hf/ceval-middle_school_physics,llama-3-70b-instruct-hf/ceval-middle_school_chemistry,llama-3-70b-instruct-hf/ceval-veterinary_medicine,llama-3-70b-instruct-hf/ceval-college_economics,llama-3-70b-instruct-hf/ceval-business_administration,llama-3-70b-instruct-hf/ceval-marxism,llama-3-70b-instruct-hf/ceval-mao_zedong_thought,llama-3-70b-instruct-hf/ceval-education_science,llama-3-70b-instruct-hf/ceval-teacher_qualification,llama-3-70b-instruct-hf/ceval-high_school_politics,llama-3-70b-instruct-hf/ceval-high_school_geography,llama-3-70b-instruct-hf/ceval-middle_school_politics,llama-3-70b-instruct-hf/ceval-middle_school_geography,llama-3-70b-instruct-hf/ceval-modern_chinese_history,llama-3-70b-instruct-hf/ceval-ideological_and_moral_cultivation,llama-3-70b-instruct-hf/ceval-logic,llama-3-70b-instruct-hf/ceval-law,llama-3-70b-instruct-hf/ceval-chinese_language_and_literature,llama-3-70b-instruct-hf/ceval-art_studies,llama-3-70b-instruct-hf/ceval-professional_tour_guide,llama-3-70b-instruct-hf/ceval-legal_professional,llama-3-70b-instruct-hf/ceval-high_school_chinese,llama-3-70b-instruct-hf/ceval-high_school_history,llama-3-70b-instruct-hf/ceval-middle_school_history,llama-3-70b-instruct-hf/ceval-civil_servant,llama-3-70b-instruct-hf/ceval-sports_science,llama-3-70b-instruct-hf/ceval-plant_protection,llama-3-70b-instruct-hf/ceval-basic_medicine,llama-3-70b-instruct-hf/ceval-clinical_medicine,llama-3-70b-instruct-hf/ceval-urban_and_rural_planner,llama-3-70b-instruct-hf/ceval-accountant,llama-3-70b-instruct-hf/ceval-fire_engineer,llama-3-70b-instruct-hf/ceval-environmental_impact_assessment_engineer,llama-3-70b-instruct-hf/ceval-tax_accountant,llama-3-70b-instruct-hf/ceval-physician]
You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama_fast.LlamaTokenizerFast'>. This is expected, and simply means that the legacy (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set legacy=False. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565 - if you loaded a llama tokenizer from a GGUF file you can ignore this message.
/usr/local/lib/python3.10/site-packages/awq/modules/linear/exllama.py:12: UserWarning: AutoAWQ could not load ExLlama kernels extension. Details: No module named 'exl_ext'
  warnings.warn(f"AutoAWQ could not load ExLlama kernels extension. Details: {ex}")
/usr/local/lib/python3.10/site-packages/awq/modules/linear/exllamav2.py:13: UserWarning: AutoAWQ could not load ExLlamaV2 kernels extension. Details: No module named 'exlv2_ext'
  warnings.warn(f"AutoAWQ could not load ExLlamaV2 kernels extension. Details: {ex}")
/usr/local/lib/python3.10/site-packages/awq/modules/linear/gemm.py:14: UserWarning: AutoAWQ could not load GEMM kernels extension. Details: No module named 'awq_ext'
  warnings.warn(f"AutoAWQ could not load GEMM kernels extension. Details: {ex}")
/usr/local/lib/python3.10/site-packages/awq/modules/linear/gemv.py:11: UserWarning: AutoAWQ could not load GEMV kernels extension. Details: No module named 'awq_ext'
  warnings.warn(f"AutoAWQ could not load GEMV kernels extension. Details: {ex}")
/usr/local/lib/python3.10/site-packages/awq/modules/linear/gemv_fast.py:10: UserWarning: AutoAWQ could not load GEMVFast kernels extension. Details: No module named 'awq_v2_ext'
  warnings.warn(f"AutoAWQ could not load GEMVFast kernels extension. Details: {ex}")
Loading checkpoint shards: 100%|██████████| 9/9 [00:10<00:00, 1.17s/it]
09/23 20:31:55 - OpenCompass - INFO - using stop words: ['<|eot_id|>', '<|end_of_text|>']
09/23 20:31:55 - OpenCompass - INFO - Start inferencing [llama-3-70b-instruct-hf/ceval-computer_network]
100%|██████████| 19/19 [00:00<00:00, 463324.28it/s]
[2024-09-23 20:31:56,017] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting build dataloader
[2024-09-23 20:31:56,017] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
  0%|          | 0/3 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/workspace/opencompass/opencompass/tasks/openicl_infer.py", line 161, in <module>
    inferencer.run()
  File "/workspace/opencompass/opencompass/tasks/openicl_infer.py", line 89, in run
    self._inference()
  File "/workspace/opencompass/opencompass/tasks/openicl_infer.py", line 139, in _inference
    inferencer.inference(retriever,
  File "/workspace/opencompass/opencompass/openicl/icl_inferencer/icl_gen_inferencer.py", line 153, in inference
    results = self.model.generate_from_template(
  File "/workspace/opencompass/opencompass/models/base.py", line 201, in generate_from_template
    return self.generate(inputs, max_out_len=max_out_len, **kwargs)
  File "/workspace/opencompass/opencompass/models/huggingface_above_v4_33.py", line 440, in generate
    messages = [self.tokenizer.apply_chat_template(m, add_generation_prompt=True, tokenize=False) for m in messages]
  File "/workspace/opencompass/opencompass/models/huggingface_above_v4_33.py", line 440, in <listcomp>
    messages = [self.tokenizer.apply_chat_template(m, add_generation_prompt=True, tokenize=False) for m in messages]
  File "/usr/local/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1786, in apply_chat_template
    chat_template = self.get_chat_template(chat_template, tools)
  File "/usr/local/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2025, in get_chat_template
    raise ValueError(
ValueError: Cannot use apply_chat_template() because tokenizer.chat_template is not set and no template argument was passed! For information about writing templates and setting the tokenizer.chat_template attribute, please see the documentation at https://huggingface.co/docs/transformers/main/en/chat_templating
[2024-09-23 20:32:05,799] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 299549) of binary: /usr/local/bin/python3.10
Traceback (most recent call last):
  File "/usr/local/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/torch/distributed/run.py", line 806, in main
    run(args)
  File "/usr/local/lib/python3.10/site-packages/torch/distributed/run.py", line 797, in run
    elastic_launch(
  File "/usr/local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/usr/local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 264, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
/workspace/opencompass/opencompass/tasks/openicl_infer.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2024-09-23_20:32:05
  host      : localhost.localdomain
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 299549)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================
```

Other information

No response
tonysy commented 1 month ago

Could you please ensure that the model can be loaded successfully directly using transformers?
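A quick way to run that check, as a minimal sketch (the path is the local AWQ checkpoint from the config above; loading the AWQ weights assumes autoawq and accelerate are installed, which the environment dump suggests):

```python
# Sanity check: load the checkpoint with transformers alone, outside OpenCompass.
from transformers import AutoModelForCausalLM, AutoTokenizer

path = '/data/Llama-3-70B-Instruct-AWQ/'
tokenizer = AutoTokenizer.from_pretrained(path)
model = AutoModelForCausalLM.from_pretrained(path, device_map='auto')

# If this prints False, the problem is in the checkpoint's tokenizer files,
# not in OpenCompass itself.
print('chat_template set:', tokenizer.chat_template is not None)
```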

daidaiershidi commented 1 month ago

> Could you please ensure that the model can be loaded successfully directly using transformers?

This error happens because transformers dropped the default chat template after 4.43, yet many of OpenCompass's earlier evaluations seem to have relied on that default chat template. Will this be handled in the code going forward?
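Until then, one possible workaround (a sketch, not an official fix) is to copy a chat template into the local checkpoint yourself, assuming you can access the gated meta-llama/Meta-Llama-3-70B-Instruct repo on the Hub:

```python
# Workaround sketch: persist a chat template into the local AWQ checkpoint so
# apply_chat_template() no longer raises. Assumes access to the gated source repo.
from transformers import AutoTokenizer

src = AutoTokenizer.from_pretrained('meta-llama/Meta-Llama-3-70B-Instruct')
dst = AutoTokenizer.from_pretrained('/data/Llama-3-70B-Instruct-AWQ/')

dst.chat_template = src.chat_template
dst.save_pretrained('/data/Llama-3-70B-Instruct-AWQ/')  # rewrites tokenizer_config.json
```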