thunlp / InfLLM

Code for our paper "InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Memory"
MIT License

ValueError: Only supports llama, mistral and qwen2 models. #37

Open thistleknot opened 5 months ago

thistleknot commented 5 months ago
import yaml

from inf_llm.utils import patch_hf
from transformers import AutoModel

def load_yaml_config(file_path='path_to_your_config_file.yaml'):
    """Load a YAML configuration file."""
    with open(file_path, 'r') as file:
        return yaml.safe_load(file)

# Load the configuration for infinite context
config_path = 'minicpm-inf-llm.yaml'
inf_llm_config = load_yaml_config(file_path=config_path)
inf_llm_config
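
The `model` section loaded here needs at least a `type` key, which the next cell passes to `patch_hf` as `attn_type`; the remaining keys are forwarded as keyword arguments (the signature in the traceback below names `attn_kwargs`, `base`, and `distance_scale`). A sketch of the parsed structure, where every value is an illustrative assumption:

# Hypothetical shape of the parsed YAML; only the 'type' key and the
# keyword names from the patch_hf signature are grounded in this post.
inf_llm_config = {
    "model": {
        "type": "inf-llm",       # forwarded as attn_type
        "base": None,            # RoPE base; per the traceback, patch_hf
                                 # falls back to the model's own base if None
        "distance_scale": 1.0,
        "attn_kwargs": {},       # extra attention settings
    }
}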

# `model` itself was loaded in an earlier cell (not shown), presumably with
# AutoModel.from_pretrained above; the failing cell is:
from inf_llm.utils import patch_hf
config = load_yaml_config(file_path=config_path)['model']
model = patch_hf(model, config['type'], **config)

produces:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[26], line 3
      1 from inf_llm.utils import patch_hf
      2 config = load_yaml_config(file_path=config_path)['model']
----> 3 model = patch_hf(model, config['type'], **config)

File /home/user/mamba/InfLLM/inf_llm/utils/patch.py:150, in patch_hf(model, attn_type, attn_kwargs, base, distance_scale, **kwargs)
    148     Model = model.model.__class__
    149 else:
--> 150     raise ValueError("Only supports llama, mistral and qwen2 models.")
    152 hf_rope = model.model.layers[0].self_attn.rotary_emb 
    153 base = base if base is not None else hf_rope.base

ValueError: Only supports llama, mistral and qwen2 models.
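
The raise at line 150 of inf_llm/utils/patch.py fires because patch_hf dispatches on the class of the model object it is given, and the error message lists only the Llama, Mistral, and Qwen2 families. MiniCPM checkpoints load through trust_remote_code, so their class (presumably MiniCPMForCausalLM) falls through to the else branch. A minimal sketch of a call that does pass the check, assuming a Mistral-family checkpoint; the checkpoint name and config filename below are illustrative assumptions, not taken from the repo:

# Hedged workaround sketch: patch a model family the check accepts.
import yaml
import torch
from transformers import AutoModelForCausalLM
from inf_llm.utils import patch_hf

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.2",  # a Mistral-class model, accepted by the check
    torch_dtype=torch.bfloat16,
)

with open("mistral-inf-llm.yaml") as f:  # hypothetical InfLLM config for this model
    config = yaml.safe_load(f)["model"]

model = patch_hf(model, config["type"], **config)

Supporting MiniCPM itself would presumably mean extending the dispatch in patch.py to recognize its remote-code model class.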