unslothai / unsloth

Finetune Llama 3.1, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory
https://unsloth.ai
Apache License 2.0
15.16k stars, 1.01k forks

ModuleNotFoundError: No module named 'torch.nn.attention' #778

Open WasamiKirua opened 1 month ago

WasamiKirua commented 1 month ago

I usually train models on Vast.ai instances. My process has not changed: I normally create instances with Torch 2.1.1 and/or 2.2.0 and CUDA 12.1. I am using an RTX 3090.

As always, I ran the following (in this case with torch 2.1.1):

!pip install --upgrade --force-reinstall --no-cache-dir torch==2.1.1 triton --index-url https://download.pytorch.org/whl/cu121

!pip install "unsloth[cu121-ampere-torch211] @ git+https://github.com/unslothai/unsloth.git"

There were no issues or errors during the requirements installation. Immediately afterwards, I ran the first cell of the notebook (Gemma2):

from unsloth import FastLanguageModel
  File "/opt/conda/lib/python3.10/site-packages/unsloth/__init__.py", line 159, in <module>
    from .models import *
  File "/opt/conda/lib/python3.10/site-packages/unsloth/models/__init__.py", line 15, in <module>
    from .loader  import FastLanguageModel
  File "/opt/conda/lib/python3.10/site-packages/unsloth/models/loader.py", line 15, in <module>
    from .llama import FastLlamaModel, logger
  File "/opt/conda/lib/python3.10/site-packages/unsloth/models/llama.py", line 29, in <module>
    from ..kernels import *
  File "/opt/conda/lib/python3.10/site-packages/unsloth/kernels/__init__.py", line 36, in <module>
    from .flex_attention import HAS_FLEX_ATTENTION, slow_attention_softcapping
  File "/opt/conda/lib/python3.10/site-packages/unsloth/kernels/flex_attention.py", line 28, in <module>
    import torch.nn.attention
ModuleNotFoundError: No module named 'torch.nn.attention'

I have tried at least 5 different instances. As I said, I successfully ran the notebook a couple of days ago. I did not go through the recent commits, but can someone help?

many thanks
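
Since the traceback depends on which torch build was actually resolved after both `pip install` steps, it can help to print the installed versions before importing `unsloth`. The snippet below is a minimal sketch and not part of the original report; the helper name `installed_version` and the list of packages checked are assumptions chosen for illustration.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Hypothetical list of packages relevant to this report; adjust as needed.
for pkg in ("torch", "triton", "xformers", "unsloth"):
    print(f"{pkg}: {installed_version(pkg) or 'not installed'}")
```

If the reported torch version differs from the one that was pinned, the second install step may have replaced it.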

WasamiKirua commented 1 month ago

I just tried the same locally (RTX 3090, conda install) and I am getting the same error:


---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
Cell In[1], line 1
----> 1 from unsloth import FastLanguageModel
      2 import torch
      3 max_seq_length = 2048 # Choose any! We auto support RoPE Scaling internally!

File ~/anaconda3/envs/unsloth_env/lib/python3.10/site-packages/unsloth/__init__.py:159
    149         warnings.warn(
    150             "Unsloth: CUDA is not linked properly.\n"\
    151             "Try running `python -m bitsandbytes` then `python -m xformers.info`\n"\
   (...)
    155             "Unsloth will still run for now, but maybe it might crash - let's hope it works!"
    156         )
    157 pass
--> 159 from .models import *
    160 from .save import *
    161 from .chat_templates import *

File ~/anaconda3/envs/unsloth_env/lib/python3.10/site-packages/unsloth/models/__init__.py:15
      1 # Copyright 2023-present Daniel Han-Chen & the Unsloth team. All rights reserved.
      2 #
      3 # Licensed under the Apache License, Version 2.0 (the "License");
   (...)
     12 # See the License for the specific language governing permissions and
     13 # limitations under the License.
---> 15 from .loader  import FastLanguageModel
     16 from .llama   import FastLlamaModel
     17 from .mistral import FastMistralModel

File ~/anaconda3/envs/unsloth_env/lib/python3.10/site-packages/unsloth/models/loader.py:15
      1 # Copyright 2023-present Daniel Han-Chen & the Unsloth team. All rights reserved.
      2 #
      3 # Licensed under the Apache License, Version 2.0 (the "License");
   (...)
     12 # See the License for the specific language governing permissions and
     13 # limitations under the License.
---> 15 from .llama import FastLlamaModel, logger
     16 from .mistral import FastMistralModel
     17 from .qwen2 import FastQwen2Model

File ~/anaconda3/envs/unsloth_env/lib/python3.10/site-packages/unsloth/models/llama.py:29
     21 from transformers.models.llama.modeling_llama import (
     22     logger,
     23     BaseModelOutputWithPast,
     24     CausalLMOutputWithPast,
     25 )
     26 from transformers.modeling_attn_mask_utils import (
     27     _prepare_4d_causal_attention_mask_for_sdpa,
     28 )
---> 29 from ..kernels import *
     30 from ..tokenizer_utils import *
     31 if HAS_FLASH_ATTENTION:

File ~/anaconda3/envs/unsloth_env/lib/python3.10/site-packages/unsloth/kernels/__init__.py:36
     25 from .fast_lora import (
     26     get_lora_parameters,
     27     get_lora_parameters_bias,
   (...)
     32     apply_lora_o,
     33 )
     34 from .utils import fast_dequantize, fast_gemv, QUANT_STATE, fast_linear_forward, matmul_lora
---> 36 from .flex_attention import HAS_FLEX_ATTENTION, slow_attention_softcapping
     38 if HAS_FLEX_ATTENTION:
     39     from .flex_attention import (
     40         FLEX_ATTENTION_PADDING,
     41     )

File ~/anaconda3/envs/unsloth_env/lib/python3.10/site-packages/unsloth/kernels/flex_attention.py:28
     19 torch_compile_options = {
     20     "epilogue_fusion"   : True,
     21     "max_autotune"      : True,
   (...)
     24     "triton.cudagraphs" : False,
     25 }
     27 # Flex Attention supported from torch 2.5 onwards only
---> 28 import torch.nn.attention
     29 if hasattr(torch.nn.attention, "flex_attention"):
     30     import torch.nn.attention.flex_attention

ModuleNotFoundError: No module named 'torch.nn.attention'

danielhanchen commented 1 month ago

Apologies, just fixed!
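
For readers hitting the same traceback on an older install: the failure comes from the unconditional `import torch.nn.attention` in `flex_attention.py`, which raises on a torch build that does not ship that module. The thread does not show the actual fix, so the following is only a sketch of one common way to guard an optional import. The helper `optional_import` is hypothetical; `HAS_FLEX_ATTENTION` is the flag name that appears in the traceback, reused here purely for illustration.

```python
import importlib


def optional_import(name):
    """Import and return module `name`, or return None if it cannot be imported."""
    try:
        return importlib.import_module(name)
    except ImportError:  # ModuleNotFoundError is a subclass of ImportError
        return None


# Only probe the submodule if the parent module exists in this torch build.
_attention = optional_import("torch.nn.attention")
_flex = (
    optional_import("torch.nn.attention.flex_attention")
    if _attention is not None
    else None
)

# Assumption: downstream code branches on this flag instead of importing directly.
HAS_FLEX_ATTENTION = _flex is not None
print("flex attention available:", HAS_FLEX_ATTENTION)
```

With a guard of this kind, an environment without the module falls back to the non-flex code path instead of failing at import time.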