Open YcChou opened 3 months ago
Hi! Do you have an internet connection? If so, does datasets.load_dataset("Rowan/hellaswag")
load HellaSwag successfully?
I've had trouble reproducing issues like this in the past, but they seem to crop up when people cannot connect to HF or when a download has failed.
Yes, later on when I configured the network, there were no more errors.
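For anyone hitting the same thing, the check suggested above can be sketched roughly as follows. The helper name `can_load` and its `loader` parameter are made up for illustration and are not part of lm-evaluation-harness or datasets; the only real API assumed here is `datasets.load_dataset`.

```python
def can_load(dataset_name, loader=None):
    """Try to load a dataset and report whether it worked.

    `can_load` and `loader` are hypothetical names used for this sketch.
    By default the real `datasets.load_dataset` is used; a different
    callable can be passed in to exercise the helper without network access.
    """
    if loader is None:
        # Imported lazily so the helper can be defined without datasets installed.
        import datasets
        loader = datasets.load_dataset
    try:
        loader(dataset_name)
    except Exception as exc:  # report any failure instead of crashing
        return False, f"{type(exc).__name__}: {exc}"
    return True, None


# Example call (needs network access and the datasets package):
# ok, error = can_load("Rowan/hellaswag")
# print(ok, error)
```

If this reports a failure even after the network is fixed, a leftover partial download in the local cache is one possible cause; passing `download_mode="force_redownload"` to `datasets.load_dataset` may be worth trying in that case.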
When evaluating a few tasks, some of them (HellaSwag, PiQA, math_word_problems) failed with the following error:
Traceback (most recent call last):
  File "/data//anaconda3/envs/eval/bin/lm_eval", line 8, in <module>
    sys.exit(cli_evaluate())
  File "/data//projects/eval/lm-evaluation-harness/lm_eval/__main__.py", line 382, in cli_evaluate
    results = evaluator.simple_evaluate(
  File "/data//projects/eval/lm-evaluation-harness/lm_eval/utils.py", line 397, in _wrapper
    return fn(*args, **kwargs)
  File "/data//projects/eval/lm-evaluation-harness/lm_eval/evaluator.py", line 227, in simple_evaluate
    task_dict = get_task_dict(tasks, task_manager)
  File "/data//projects/eval/lm-evaluation-harness/lm_eval/tasks/__init__.py", line 616, in get_task_dict
    task_name_from_string_dict = task_manager.load_task_or_group(
  File "/data//projects/eval/lm-evaluation-harness/lm_eval/tasks/__init__.py", line 410, in load_task_or_group
    collections.ChainMap(map(self._load_individual_task_or_group, task_list))
  File "/data//projects/eval/lm-evaluation-harness/lm_eval/tasks/__init__.py", line 310, in _load_individual_task_or_group
    return _load_task(task_config, task=name_or_config)
  File "/data//projects/eval/lm-evaluation-harness/lm_eval/tasks/__init__.py", line 276, in _load_task
    task_object = ConfigurableTask(config=config)
  File "/data//projects/eval/lm-evaluation-harness/lm_eval/api/task.py", line 822, in __init__
    self.download(self.config.dataset_kwargs)
  File "/data//projects/eval/lm-evaluation-harness/lm_eval/api/task.py", line 931, in download
    self.dataset = datasets.load_dataset(
  File "/data//anaconda3/envs/eval/lib/python3.10/site-packages/datasets/load.py", line 2519, in load_dataset
    builder_instance = load_dataset_builder(
  File "/data//anaconda3/envs/eval/lib/python3.10/site-packages/datasets/load.py", line 2192, in load_dataset_builder
    dataset_module = dataset_module_factory(
  File "/data//anaconda3/envs/eval/lib/python3.10/site-packages/datasets/load.py", line 1843, in dataset_module_factory
    raise e1 from None
  File "/data//anaconda3/envs/eval/lib/python3.10/site-packages/datasets/load.py", line 1795, in dataset_module_factory
    can_load_config_from_parquet_export = "DEFAULT_CONFIG_NAME" not in f.read()
  File "/data//anaconda3/envs/eval/lib/python3.10/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb5 in position 1: invalid start byte
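The last frame shows `f.read()` failing because the file being read is not valid UTF-8 text. The traceback does not say which file that is, so the sketch below only illustrates how to test a suspect file by hand. The helper names `first_invalid_utf8` and `check_file` are hypothetical, and the path you pass in is whatever cached file you suspect.

```python
def first_invalid_utf8(data: bytes):
    """Return (position, byte) of the first byte that breaks UTF-8 decoding,
    or None if the data decodes cleanly. Hypothetical helper for illustration."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start, data[exc.start]
    return None


def check_file(path):
    """Read a file as raw bytes and report the first invalid UTF-8 byte, if any."""
    with open(path, "rb") as f:
        return first_invalid_utf8(f.read())


# A two-byte payload whose second byte is 0xb5 fails the same way as the
# traceback above: 0xb5 is a continuation byte and cannot start a character.
print(first_invalid_utf8(b"a\xb5"))
```

A file that fails this check where plain text is expected (for example a truncated or binary response saved during a failed download) would be consistent with the error above, although the thread does not confirm which file was affected.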
Versions: lm-evaluation-harness: latest; datasets: 2.20.0 (2.16.0 also failed)