I followed the steps in the README and copied the three modeling files modeling_mistral.py, modeling_utils.py, and configuration_mistral.py into my transformers folders.
Target folders for the changed files:
lib/python3.11/site-packages/transformers/
lib64/python3.11/site-packages/transformers/
Clone the repository to your local machine and copy the modeling files into transformers/src/transformers/models/mistral
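The copy step above can be scripted. This is a minimal sketch, not part of the README; the helper name copy_modeling_files and the commented-out paths are my own, and you should adjust them to your clone location and Python environment:

```python
import shutil
from pathlib import Path

def copy_modeling_files(repo_dir, target_dirs,
                        files=("modeling_mistral.py",
                               "modeling_utils.py",
                               "configuration_mistral.py")):
    """Copy the patched modeling files over the installed transformers copies."""
    repo_dir = Path(repo_dir)
    for target in target_dirs:
        for name in files:
            shutil.copy2(repo_dir / name, Path(target) / name)

# Hypothetical usage -- adjust both paths to your environment:
# copy_modeling_files(
#     "transformers/src/transformers/models/mistral",
#     ["lib/python3.11/site-packages/transformers/models/mistral",
#      "lib64/python3.11/site-packages/transformers/models/mistral"],
# )
```

Note that with both a lib/ and a lib64/ site-packages directory present, patching both is the safe option, since only one of them ends up on the interpreter's import path.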
When initializing the weights, specify the self_extend attention mechanism as follows:
model = MistralForCausalLM.from_pretrained("hf_mistral-7B-v0.1", attn_implementation="self_extend")
Running the model results in the following error:
lib64/python3.11/site-packages/transformers/modeling_utils.py", line 1491, in _check_and_enable_sdpa
raise ValueError(
ValueError: MistralForCausalLM does not support an attention implementation through torch.nn.functional.scaled_dot_product_attention yet. Please open an issue on GitHub to request support for this architecture: https://github.com/huggingface/transformers/issues/new
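Since the error comes from the stock _check_and_enable_sdpa in modeling_utils.py, it suggests the interpreter is importing an unpatched copy of transformers rather than the files I replaced. A quick diagnostic sketch (the helper name where is my own) to confirm which file a module is actually loaded from:

```python
import importlib
import inspect

def where(module_name):
    """Return the file a module is actually imported from."""
    mod = importlib.import_module(module_name)
    return inspect.getsourcefile(mod)

# Hypothetical usage -- run inside the same interpreter that raised the error,
# then check the printed paths against the folders you patched:
# print(where("transformers"))
# print(where("transformers.models.mistral.modeling_mistral"))
```

If the printed paths point at a directory other than the one holding the patched files, the self_extend implementation was never picked up, and the installed modeling_utils.py rejects the request as an unsupported SDPA variant.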
Versions: