FMInference / H2O

[NeurIPS'23] H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models.

Package error when reproducing https://github.com/FMInference/H2O/tree/main/h2o_hf #19

Open KylinC opened 8 months ago

KylinC commented 8 months ago

Some issues occur when I run `bash scripts/streaming/eval.sh h2o`:

USER: Compose an engaging travel blog post about a recent trip to Hawaii, highlighting cultural experiences and must-see attractions.

ASSISTANT:

```
Traceback (most recent call last):
  File "/hetu_group/chenqilin/H2O/h2o_hf/run_streaming.py", line 150, in <module>
    main(args)
  File "/hetu_group/chenqilin/H2O/h2o_hf/run_streaming.py", line 121, in main
    streaming_inference_heavy_hitter(
  File "/hetu_group/chenqilin/python_envs/h2o2/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/hetu_group/chenqilin/H2O/h2o_hf/run_streaming.py", line 96, in streaming_inference_heavy_hitter
    past_key_values = greedy_generate(
  File "/hetu_group/chenqilin/python_envs/h2o2/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/hetu_group/chenqilin/H2O/h2o_hf/run_streaming.py", line 23, in greedy_generate
    outputs = model(
  File "/hetu_group/chenqilin/python_envs/h2o2/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/hetu_group/chenqilin/python_envs/h2o2/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 1183, in forward
    outputs = self.model(
  File "/hetu_group/chenqilin/python_envs/h2o2/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/hetu_group/chenqilin/python_envs/h2o2/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 1070, in forward
    layer_outputs = decoder_layer(
  File "/hetu_group/chenqilin/python_envs/h2o2/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/hetu_group/chenqilin/python_envs/h2o2/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 798, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
  File "/hetu_group/chenqilin/python_envs/h2o2/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/hetu_group/chenqilin/H2O/h2o_hf/utils_real_drop/modify_llama.py", line 678, in forward
    past_key_values_length=past_key_value[0].shape[-2] if past_key_value is not None else 0,
  File "/hetu_group/chenqilin/python_envs/h2o2/lib/python3.10/site-packages/transformers/cache_utils.py", line 78, in __getitem__
    raise KeyError(f"Cache only has {len(self)} layers, attempted to access layer with index {layer_idx}")
KeyError: 'Cache only has 0 layers, attempted to access layer with index 0'
```
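
For context, the crash is in the last frame: `modify_llama.py` indexes `past_key_value[0]` in the legacy tuple style, while newer transformers releases pass a `Cache` object that raises on unpopulated layers. A minimal sketch that reproduces the same KeyError outside the repo, assuming a transformers release from the era of this report (roughly 4.36+, when `cache_utils` was introduced):

```python
# Minimal repro sketch of the KeyError above (assumption: a transformers
# release that ships cache_utils, i.e. roughly >= 4.36). An empty
# DynamicCache raises on any layer access, which is what the legacy
# tuple-style indexing in modify_llama.py triggers on the first forward pass.
from transformers.cache_utils import DynamicCache

cache = DynamicCache()  # no layers populated yet
key_states, value_states = cache[0]  # KeyError: 'Cache only has 0 layers, ...'
```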

HamidShojanazeri commented 8 months ago

I am getting the same issue as well.

Kyriection commented 8 months ago

Hi, that might result from the version of transformers. The current code is based on transformers==4.31.0. We will modify the code to support the latest transformers version and release it shortly.
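
In the meantime, pinning the environment with `pip install transformers==4.31.0` should sidestep the error. For anyone who needs a newer transformers, here is a minimal compatibility sketch for the failing expression in `utils_real_drop/modify_llama.py`; this is an assumption-based workaround, not the maintainers' fix, and `get_past_length` is a hypothetical helper name:

```python
# Hedged sketch of a cache-format-agnostic replacement for the expression
# that fails in modify_llama.py; get_past_length is a hypothetical helper.
def get_past_length(past_key_value):
    if past_key_value is None:
        return 0
    # New-style Cache objects (transformers >= 4.36) expose get_seq_length().
    if hasattr(past_key_value, "get_seq_length"):
        return past_key_value.get_seq_length()
    # Legacy format: a (key_states, value_states) tuple whose key tensor
    # is shaped (batch, num_heads, seq_len, head_dim).
    return past_key_value[0].shape[-2]

# At the original call site:
# past_key_values_length = get_past_length(past_key_value)
```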

yjsunn commented 5 months ago

> Hi, that might result from the version of transformers. The current code is based on transformers==4.31.0. We will modify the code to support the latest transformers version and release it shortly.

Hi, may I ask which versions of crfm-helm and lm-eval the current code is based on?

yjsunn commented 5 months ago

> Hi, that might result from the version of transformers. The current code is based on transformers==4.31.0. We will modify the code to support the latest transformers version and release it shortly.

It seems that crfm-helm==0.5.0 only supports transformers>4.37, while crfm-helm==0.4.0 only supports transformers==4.33.3. And lm_eval seems unable to load the dataset when transformers==4.33.3:

```
Traceback (most recent call last):
  File "/H2O/h2o_hf/generate_task_data.py", line 58, in <module>
    results = evaluator.evaluate(adaptor, tasks.get_task_dict([args.task_name]), False, args.num_fewshot, None)
  File "/opt/conda/lib/python3.10/site-packages/lm_eval/tasks/__init__.py", line 317, in get_task_dict
    task_name_dict = {
  File "/opt/conda/lib/python3.10/site-packages/lm_eval/tasks/__init__.py", line 318, in <dictcomp>
    task_name: get_task(task_name)()
  File "/opt/conda/lib/python3.10/site-packages/lm_eval/base.py", line 412, in __init__
    self.download(data_dir, cache_dir, download_mode)
  File "/opt/conda/lib/python3.10/site-packages/lm_eval/base.py", line 441, in download
    self.dataset = datasets.load_dataset(
  File "/opt/conda/lib/python3.10/site-packages/datasets/load.py", line 1670, in load_dataset
    builder_instance = load_dataset_builder(
  File "/opt/conda/lib/python3.10/site-packages/datasets/load.py", line 1447, in load_dataset_builder
    dataset_module = dataset_module_factory(
  File "/opt/conda/lib/python3.10/site-packages/datasets/load.py", line 1172, in dataset_module_factory
    raise e1 from None
  File "/opt/conda/lib/python3.10/site-packages/datasets/load.py", line 1151, in dataset_module_factory
    return HubDatasetModuleFactoryWithoutScript(
  File "/opt/conda/lib/python3.10/site-packages/datasets/load.py", line 744, in __init__
    assert self.name.count("/") == 1
AssertionError
```
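
The assertion at the bottom comes from the `datasets` Hub resolver, which expects a `namespace/name` repo id with exactly one "/"; the bare task name this lm_eval version forwards does not match, a typical symptom of mismatched package pairings. A quick diagnostic sketch for surfacing the installed versions before running `generate_task_data.py` (the strings are PyPI distribution names; nothing below comes from the H2O repo):

```python
# Diagnostic sketch: print the versions of the packages this thread is
# juggling so incompatible pairings are visible up front.
import importlib.metadata as md

for pkg in ("transformers", "datasets", "lm-eval", "crfm-helm"):
    try:
        print(pkg, md.version(pkg))
    except md.PackageNotFoundError:
        print(pkg, "not installed")
```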