lyogavin / airllm

AirLLM 70B inference with single 4GB GPU
Apache License 2.0

AttributeError: 'AirLLMLlama2' object has no attribute '_supports_cache_class' #156

Open · Source61 opened 3 months ago

Source61 commented 3 months ago

Model: WizardLMTeam/WizardCoder-33B-V1.1

```
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
new version of transfomer, no need to use BetterTransformer, try setting attn impl to sdpa...
attn imp: <class 'transformers.models.llama.modeling_llama.LlamaSdpaAttention'>
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:32014 for open-end generation.
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Traceback (most recent call last):
  File "/home/john/dev/AI/./test.py", line 24, in <module>
    generation_output = model.generate(
                        ^^^^^^^^^^^^^^^
  File "/home/john/.local/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/john/.local/lib/python3.11/site-packages/transformers/generation/utils.py", line 1777, in generate
    elif generation_config.cache_implementation is None and self._supports_default_dynamic_cache():
                                                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/john/.local/lib/python3.11/site-packages/transformers/generation/utils.py", line 1454, in _supports_default_dynamic_cache
    return self._supports_cache_class and "jamba" not in self.__class__.__name__.lower()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'AirLLMLlama2' object has no attribute '_supports_cache_class'
```
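For context, the crash happens because newer `transformers` versions read a `_supports_cache_class` attribute inside `GenerationMixin._supports_default_dynamic_cache()`; `transformers`' own model classes define it, but AirLLM's `AirLLMLlama2` wrapper apparently does not, so the bare attribute access raises. A minimal stand-in (not using the real libraries; `AirLLMLikeModel` and `supports_default_dynamic_cache` are illustrative names) reproducing the lookup, plus a defensive `getattr` variant:

```python
# Stand-in for AirLLM's AirLLMLlama2 wrapper: like it, this class
# never defines the `_supports_cache_class` attribute that newer
# transformers' GenerationMixin reads.
class AirLLMLikeModel:
    pass

def supports_default_dynamic_cache(model) -> bool:
    # Same check as in the traceback above, but with getattr + default
    # so a missing attribute degrades to False instead of raising.
    return (getattr(model, "_supports_cache_class", False)
            and "jamba" not in model.__class__.__name__.lower())

model = AirLLMLikeModel()

# Direct attribute access, as in transformers' utils.py, raises:
try:
    model._supports_cache_class
except AttributeError as exc:
    print(type(exc).__name__)  # AttributeError

# The defensive variant simply reports False:
print(supports_default_dynamic_cache(model))  # False
```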

RhizomaticRobin commented 3 months ago

Source61, I got Llama 3 to work by editing the transformers utils.py source to remove the reference to the non-existent attribute and hard-code it to a boolean value of False:

Change these two functions of class `GenerationMixin` in /home/john/.local/lib/python3.11/site-packages/transformers/generation/utils.py to:

```python
def _supports_default_dynamic_cache(self) -> bool:
    # self._supports_cache_class replaced with a literal False
    return False and "jamba" not in self.__class__.__name__.lower()
```

Change this as well:

```python
def _validate_model_kwargs(self, model_kwargs: Dict[str, Any]):
    # self._supports_cache_class replaced with a literal False
    if isinstance(model_kwargs.get("past_key_values", None), Cache) and not False:
        raise ValueError(
            f"{self.__class__.__name__} does not support an instance of `Cache` as `past_key_values`. Please "
            "check the model documentation for supported cache formats."
        )
```

If you get an error again, try setting it to True instead — the missing `_supports_cache_class` attribute may actually have been meant to be True for model WizardLMTeam/WizardCoder-33B-V1.1.
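Note that edits under site-packages are lost on every `pip` upgrade. An alternative is to set the flag on the model object itself, in your own script, before calling `generate()`. A sketch of the idea on a stand-in object (`FakeModel` is illustrative; with airllm, `model` would be whatever your loading code returns):

```python
# Sketch: assign the attribute the traceback reports as missing on the
# model instance itself, instead of patching transformers' utils.py.
# FakeModel stands in for the real AirLLM model object.
class FakeModel:
    pass

model = FakeModel()

# The one extra line, placed before model.generate(...) in your script:
model._supports_cache_class = False

# transformers' check now evaluates without raising:
ok = model._supports_cache_class and "jamba" not in type(model).__name__.lower()
print(ok)  # False
```

The same assignment on the class (`AirLLMLlama2._supports_cache_class = False`) would cover every instance, at the cost of being less explicit.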

Sikander30 commented 3 months ago

Same problem with model: v2ray/Llama-3-70B