lyogavin / airllm

AirLLM 70B inference with single 4GB GPU
Apache License 2.0

I can’t run llama-3.1-405B-Instruct-bnb-4bit because of a ValueError: rope_scaling must be a dictionary with two fields. #159

Open LCG22 opened 1 month ago

LCG22 commented 1 month ago

My environment: a Kaggle notebook with a P100 GPU.

This is my code:

```python
from airllm import AutoModel

model = AutoModel.from_pretrained("unsloth/Meta-Llama-3.1-405B-Instruct-bnb-4bit")
```

This is the error info:

```
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[3], line 1
----> 1 model = AutoModel.from_pretrained("unsloth/Meta-Llama-3.1-405B-Instruct-bnb-4bit")

File /opt/conda/lib/python3.10/site-packages/airllm/auto_model.py:51, in AutoModel.from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs)
     48 if is_on_mac_os:
     49     return AirLLMLlamaMlx(pretrained_model_name_or_path, *inputs, **kwargs)
---> 51 module, cls = AutoModel.get_module_class(pretrained_model_name_or_path, *inputs, **kwargs)
     52 module = importlib.import_module(module)
     53 class_ = getattr(module, cls)

File /opt/conda/lib/python3.10/site-packages/airllm/auto_model.py:25, in AutoModel.get_module_class(cls, pretrained_model_name_or_path, *inputs, **kwargs)
     23     config = AutoConfig.from_pretrained(pretrained_model_name_or_path, trust_remote_code=True, token=kwargs['hf_token'])
     24 else:
---> 25     config = AutoConfig.from_pretrained(pretrained_model_name_or_path, trust_remote_code=True)
     27 if "QWen" in config.architectures[0]:
     28     return "airllm", "AirLLMQWen"

File /opt/conda/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py:989, in AutoConfig.from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
    983 except KeyError:
    984     raise ValueError(
    985         f"The checkpoint you are trying to load has model type `{config_dict['model_type']}` "
    986         "but Transformers does not recognize this architecture. This could be because of an "
    987         "issue with the checkpoint, or because your version of Transformers is out of date."
    988     )
--> 989 return config_class.from_dict(config_dict, **unused_kwargs)
    990 else:
    991     # Fallback: use pattern matching on the string.
    992     # We go from longer names to shorter names to catch roberta before bert (for instance)
    993     for pattern in sorted(CONFIG_MAPPING.keys(), key=len, reverse=True):

File /opt/conda/lib/python3.10/site-packages/transformers/configuration_utils.py:772, in PretrainedConfig.from_dict(cls, config_dict, **kwargs)
    769 # We remove it from kwargs so that it does not appear in return_unused_kwargs.
    770 config_dict["attn_implementation"] = kwargs.pop("attn_implementation", None)
--> 772 config = cls(**config_dict)
    774 if hasattr(config, "pruned_heads"):
    775     config.pruned_heads = {int(key): value for key, value in config.pruned_heads.items()}

File /opt/conda/lib/python3.10/site-packages/transformers/models/llama/configuration_llama.py:161, in LlamaConfig.__init__(self, vocab_size, hidden_size, intermediate_size, num_hidden_layers, num_attention_heads, num_key_value_heads, hidden_act, max_position_embeddings, initializer_range, rms_norm_eps, use_cache, pad_token_id, bos_token_id, eos_token_id, pretraining_tp, tie_word_embeddings, rope_theta, rope_scaling, attention_bias, attention_dropout, mlp_bias, **kwargs)
    159 self.rope_theta = rope_theta
    160 self.rope_scaling = rope_scaling
--> 161 self._rope_scaling_validation()
    162 self.attention_bias = attention_bias
    163 self.attention_dropout = attention_dropout

File /opt/conda/lib/python3.10/site-packages/transformers/models/llama/configuration_llama.py:182, in LlamaConfig._rope_scaling_validation(self)
    179     return
    181 if not isinstance(self.rope_scaling, dict) or len(self.rope_scaling) != 2:
--> 182     raise ValueError(
    183         "`rope_scaling` must be a dictionary with two fields, `type` and `factor`, " f"got {self.rope_scaling}"
    184     )
    185 rope_scaling_type = self.rope_scaling.get("type", None)
    186 rope_scaling_factor = self.rope_scaling.get("factor", None)

ValueError: `rope_scaling` must be a dictionary with two fields, `type` and `factor`, got {'factor': 8.0, 'high_freq_factor': 4.0, 'low_freq_factor': 1.0, 'original_max_position_embeddings': 8192, 'rope_type': 'llama3'}
```
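For context: the traceback ends in `LlamaConfig._rope_scaling_validation`, which in older transformers releases only accepts a two-field `{type, factor}` dict, while the Llama 3.1 checkpoints ship the newer `llama3`-style `rope_scaling` shown in the error. A minimal sketch to confirm the installed transformers version is the problem rather than airllm itself (the model name is just the one from the report; loading only the config is enough to hit the check):

```python
# Minimal sketch: reproduce the failure without airllm, assuming an older transformers is installed.
import transformers
from transformers import AutoConfig

print(transformers.__version__)  # older versions still run the two-field rope_scaling check

# Loading just the config hits LlamaConfig._rope_scaling_validation, because the
# checkpoint's config.json carries the new-style dict:
# {"rope_type": "llama3", "factor": 8.0, "high_freq_factor": 4.0,
#  "low_freq_factor": 1.0, "original_max_position_embeddings": 8192}
AutoConfig.from_pretrained(
    "unsloth/Meta-Llama-3.1-405B-Instruct-bnb-4bit",
    trust_remote_code=True,
)
```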

191611 commented 1 month ago

The author already covers this in the 405B example:

If you see errors like `ValueError: rope_scaling must be a dictionary with two fields, type and factor`, you need to upgrade transformers to >= 4.43.0:

```
pip install transformers==4.43.3
```
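Putting it together, a minimal sketch of the fixed flow in the Kaggle notebook (the version pin is the one quoted above; restart the kernel after the pip install so the new transformers is actually imported):

```python
# Run `pip install transformers==4.43.3` first, then restart the kernel.
from packaging import version
import transformers

# Guard against the old version still being loaded in the running kernel.
assert version.parse(transformers.__version__) >= version.parse("4.43.0"), transformers.__version__

from airllm import AutoModel

# Same call as in the original report; with transformers >= 4.43 the llama3-style
# rope_scaling dict passes config validation, so loading proceeds past this error.
model = AutoModel.from_pretrained("unsloth/Meta-Llama-3.1-405B-Instruct-bnb-4bit")
```

Note this only resolves the `rope_scaling` validation error; whether the 405B shards then fit Kaggle's disk and runtime limits is a separate question.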