declare-lab / instruct-eval

This repository contains code to quantitatively evaluate instruction-tuned models such as Alpaca and Flan-T5 on held-out tasks.
https://declare-lab.github.io/instruct-eval/
Apache License 2.0
528 stars 42 forks

Evaluate EncoderDecoderModels #32

Open Bachstelze opened 9 months ago

Bachstelze commented 9 months ago

There are a few errors occurring. With instructionBERT:

```
python main.py drop --model_name seq_to_seq --model_path Bachstelze/instructionBERT
```

```
Traceback (most recent call last):
  File "main.py", line 98, in <module>
    Fire(main)
  File "/home/hilsenbek/.conda/envs/instruct-eval/lib/python3.8/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/home/hilsenbek/.conda/envs/instruct-eval/lib/python3.8/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/home/hilsenbek/.conda/envs/instruct-eval/lib/python3.8/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "main.py", line 30, in main
    score = task_fn(**kwargs)
  File "/home/hilsenbek/workspace/instruct-eval/mmlu.py", line 197, in main
    cors, acc, probs = evaluate(args, subject, model, dev_df, test_df)
  File "/home/hilsenbek/workspace/instruct-eval/mmlu.py", line 153, in evaluate
    pred = model.run(prompt)
  File "/home/hilsenbek/workspace/instruct-eval/modeling.py", line 158, in run
    outputs = self.model.generate(
  File "/home/hilsenbek/.conda/envs/instruct-eval/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/hilsenbek/.conda/envs/instruct-eval/lib/python3.8/site-packages/transformers/generation/utils.py", line 1267, in generate
    self._validate_model_kwargs(model_kwargs.copy())
  File "/home/hilsenbek/.conda/envs/instruct-eval/lib/python3.8/site-packages/transformers/generation/utils.py", line 1140, in _validate_model_kwargs
    raise ValueError(
ValueError: The following model_kwargs are not used by the model: ['token_type_ids'] (note: typos in the generate arguments will also show up in this list)
```
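A possible workaround (a sketch of my own, not code from this repo): the BERT tokenizer emits `token_type_ids`, which `generate()` validation rejects here, so the offending key can be filtered out of the tokenizer output before it reaches `model.generate(**encoded)`. The function name is hypothetical.

```python
def strip_unused_generate_kwargs(encoded, unused=("token_type_ids",)):
    """Return a copy of the tokenizer output without keys that
    generate() validation rejects (here: token_type_ids)."""
    return {k: v for k, v in encoded.items() if k not in unused}
```

Alternatively, many Hugging Face tokenizers accept `return_token_type_ids=False` in their `__call__`, which avoids producing the key in the first place.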

With instructionRoBERTa for Big-Bench Hard and DROP:

```
python main.py bbh --model_name seq_to_seq --model_path Bachstelze/instructionRoberta-base
python main.py drop --model_name seq_to_seq --model_path Bachstelze/instructionRoberta-base
```

```
Traceback (most recent call last):
  File "main.py", line 98, in <module>
    Fire(main)
  File "/home/hilsenbek/.conda/envs/instruct-eval/lib/python3.8/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/home/hilsenbek/.conda/envs/instruct-eval/lib/python3.8/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/home/hilsenbek/.conda/envs/instruct-eval/lib/python3.8/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "main.py", line 30, in main
    score = task_fn(**kwargs)
  File "/home/hilsenbek/workspace/instruct-eval/bbh.py", line 82, in main
    data = BBHData.load_from_huggingface(config=name)
  File "/home/hilsenbek/workspace/instruct-eval/bbh.py", line 35, in load_from_huggingface
    data = load_dataset(path, config, split=split)
  File "/home/hilsenbek/.conda/envs/instruct-eval/lib/python3.8/site-packages/datasets/load.py", line 1794, in load_dataset
    ds = builder_instance.as_dataset(split=split, verification_mode=verification_mode, in_memory=keep_in_memory)
  File "/home/hilsenbek/.conda/envs/instruct-eval/lib/python3.8/site-packages/datasets/builder.py", line 1089, in as_dataset
    raise NotImplementedError(f"Loading a dataset cached in a {type(self._fs).__name__} is not supported.")
NotImplementedError: Loading a dataset cached in a LocalFileSystem is not supported.
```
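This `NotImplementedError` usually points at a mismatch between the installed `datasets` version and an older cached copy of the dataset; upgrading `datasets` or deleting the stale cache tends to resolve it. A minimal stdlib sketch of the cache cleanup (the default cache path and the `bbh` directory naming are assumptions on my part, not verified against this repo):

```python
import os
import shutil

def clear_bbh_cache(cache_root=None):
    """Remove locally cached BBH dataset copies so that `datasets`
    re-downloads them with the currently installed library version.
    Returns the names of the removed cache entries."""
    cache_root = cache_root or os.path.expanduser("~/.cache/huggingface/datasets")
    removed = []
    if os.path.isdir(cache_root):
        for entry in os.listdir(cache_root):
            # Assumed naming: the BBH cache directory contains "bbh".
            if "bbh" in entry.lower():
                shutil.rmtree(os.path.join(cache_root, entry))
                removed.append(entry)
    return removed
```

After clearing the cache, re-running the `bbh` command forces a fresh download; `pip install -U datasets` is worth trying first.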

For MMLU:

```
python main.py mmlu --model_name seq_to_seq --model_path Bachstelze/instructionRoberta-base
```

```
Traceback (most recent call last):
  File "main.py", line 98, in <module>
    Fire(main)
  File "/home/hilsenbek/.conda/envs/instruct-eval/lib/python3.8/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/home/hilsenbek/.conda/envs/instruct-eval/lib/python3.8/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/home/hilsenbek/.conda/envs/instruct-eval/lib/python3.8/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "main.py", line 30, in main
    score = task_fn(**kwargs)
  File "/home/hilsenbek/workspace/instruct-eval/mmlu.py", line 197, in main
    cors, acc, probs = evaluate(args, subject, model, dev_df, test_df)
  File "/home/hilsenbek/workspace/instruct-eval/mmlu.py", line 153, in evaluate
    pred = model.run(prompt)
  File "/home/hilsenbek/workspace/instruct-eval/modeling.py", line 158, in run
    outputs = self.model.generate(
  File "/home/hilsenbek/.conda/envs/instruct-eval/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/hilsenbek/.conda/envs/instruct-eval/lib/python3.8/site-packages/transformers/generation/utils.py", line 1322, in generate
    model_kwargs = self._prepare_encoder_decoder_kwargs_for_generation(
  File "/home/hilsenbek/.conda/envs/instruct-eval/lib/python3.8/site-packages/transformers/generation/utils.py", line 638, in _prepare_encoder_decoder_kwargs_for_generation
    model_kwargs["encoder_outputs"]: ModelOutput = encoder(**encoder_kwargs)
  File "/home/hilsenbek/.conda/envs/instruct-eval/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/hilsenbek/.conda/envs/instruct-eval/lib/python3.8/site-packages/transformers/models/roberta/modeling_roberta.py", line 818, in forward
    buffered_token_type_ids_expanded = buffered_token_type_ids.expand(batch_size, seq_length)
RuntimeError: The expanded size of the tensor (583) must match the existing size (514) at non-singleton dimension 1. Target sizes: [1, 583]. Tensor sizes: [1, 514]
```
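This one looks like a prompt-length problem rather than a bug per se: RoBERTa-base has `max_position_embeddings = 514` (512 usable tokens plus a 2-position padding offset), and the 583-token few-shot MMLU prompt overflows it. A workaround is to truncate the encoded prompt before `generate()`. A minimal sketch (my own helper, not repo code; the defaults mirror RoBERTa-base and should really come from `tokenizer.model_max_length` and `tokenizer.eos_token_id`):

```python
def truncate_for_model(token_ids, model_max_length=512, eos_id=2):
    """Clip an over-long encoded prompt so it fits the encoder's
    position embeddings, keeping a trailing end-of-sequence token."""
    if len(token_ids) <= model_max_length:
        return token_ids
    # Keep the first model_max_length - 1 tokens, then re-append EOS.
    return token_ids[: model_max_length - 1] + [eos_id]
```

With a Hugging Face tokenizer, the same effect is normally obtained by encoding with `truncation=True, max_length=tokenizer.model_max_length` in `modeling.py` — though note that truncating a few-shot MMLU prompt drops examples, which affects scores.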