OpenGVLab / InternVideo

[ECCV2024] Video Foundation Models & Data for Multimodal Understanding
Apache License 2.0

ModuleNotFoundError: No module named 'dropout_layer_norm' #102

Open hzlcodus opened 3 months ago

hzlcodus commented 3 months ago

This error occurred while running demo.ipynb in InternVideo2's multi_modality demo. I installed the packages according to requirements.txt.

ModuleNotFoundError                       Traceback (most recent call last)
Cell In[1], line 11
      6 import torch
      8 from config import (Config,
      9                     eval_dict_leaf)
---> 11 from utils import (retrieve_text,
     12                   _frame_from_video,
     13                   setup_internvideo2)

File ~/rl-gnrl/InternVideo/InternVideo2/multi_modality/demo/utils.py:17
     13 parent_dir = os.path.dirname(current_dir)
     14 sys.path.append(parent_dir)
---> 17 from models.backbones.internvideo2 import pretrain_internvideo2_1b_patch14_224
     18 from models.backbones.bert.builder import build_bert
     19 from models.criterions import get_sim

File ~/rl-gnrl/InternVideo/InternVideo2/multi_modality/models/__init__.py:1
----> 1 from .internvideo2_clip import InternVideo2_CLIP
      2 from .internvideo2_stage2 import InternVideo2_Stage2
      3 # from .internvideo2_stage2_audio import InternVideo2_Stage2_audio

File ~/rl-gnrl/InternVideo/InternVideo2/multi_modality/models/internvideo2_clip.py:10
      7 import torchvision.transforms as transforms
      8 from torchvision.transforms import InterpolationMode
---> 10 from .backbones.internvideo2 import InternVideo2, LLaMA, Tokenizer
     11 from .criterions import VTC_VTM_Loss
     13 logger = logging.getLogger(__name__)

File ~/rl-gnrl/InternVideo/InternVideo2/multi_modality/models/backbones/internvideo2/__init__.py:1
----> 1 from .internvl_clip_vision import internvl_clip_6b
      2 from .internvideo2 import pretrain_internvideo2_1b_patch14_224, pretrain_internvideo2_6b_patch14_224
      3 from .internvideo2_clip_vision import InternVideo2

File ~/rl-gnrl/InternVideo/InternVideo2/multi_modality/models/backbones/internvideo2/internvl_clip_vision.py:16
     14     from flash_attention_class import FlashAttention
     15 from flash_attn.modules.mlp import FusedMLP
---> 16 from flash_attn.ops.rms_norm import DropoutAddRMSNorm
     19 MODEL_PATH = 'your_model_path/internvl'
     20 _MODELS = {
     21     # see InternVL
     22     "internvl_c_13b_224px": os.path.join(MODEL_PATH, "internvl_c_13b_224px.pth"),
     23 }

File ~/miniconda3/envs/internvid/lib/python3.10/site-packages/flash_attn/ops/rms_norm.py:7
      4 import torch
      5 from torch.nn import init
----> 7 from flash_attn.ops.layer_norm import DropoutAddLayerNormFn, DropoutAddLayerNormSubsetFn
      8 from flash_attn.ops.layer_norm import DropoutAddLayerNormParallelResidualFn
     11 def rms_norm(x, weight, epsilon):

File ~/miniconda3/envs/internvid/lib/python3.10/site-packages/flash_attn/ops/layer_norm.py:7
      4 import torch
      5 from torch.nn import init
----> 7 import dropout_layer_norm
     10 def maybe_align(x, alignment_in_bytes=16):
     11     """Assume that x already has last dim divisible by alignment_in_bytes
     12     """

ModuleNotFoundError: No module named 'dropout_layer_norm'
shepnerd commented 3 months ago

This is caused by missing optional libraries that ship with the flash attention source. You need to get the flash attention source code and then install layer_norm as described in https://github.com/Dao-AILab/flash-attention/blob/main/csrc/layer_norm/README.md and fused_mlp as described in https://github.com/Dao-AILab/flash-attention/blob/main/csrc/fused_dense_lib/README.md.

We will update the installation doc soon.
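For reference, the steps above usually amount to building the two CUDA extensions from a flash-attention source checkout. This is a sketch, not an official script: adjust the checkout to the tag matching your installed flash-attn version, and note that building requires the CUDA toolkit and a supported GPU architecture.

```shell
# Build the optional fused kernels from the flash-attention sources.
# Check out the tag that matches the flash-attn version in your environment.
git clone https://github.com/Dao-AILab/flash-attention.git
cd flash-attention

# dropout_layer_norm / fused layer-norm kernels (csrc/layer_norm README)
cd csrc/layer_norm && pip install . && cd ../..

# fused_dense_lib kernels used by FusedMLP (csrc/fused_dense_lib README)
cd csrc/fused_dense_lib && pip install .
```

After these install, `import dropout_layer_norm` should succeed in the same environment.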

shepnerd commented 3 months ago

If your machine does not support installing these libs, you can change the settings in config.py so that half precision and bf16 are not used. In that case, the code falls back to a naive attention implementation instead of flash attention.

raviy0807 commented 3 months ago

@shepnerd can you please specify where to make the changes? My hardware does not support flash attention, and I just want to test inference of the model from the demo notebook.

shepnerd commented 3 months ago

You can refer to this instruction to install the dependencies needed to run flash-attn with layernorm and the other components.

If your hardware does not support flash-attn and its dependencies, you can fall back to common attention with full-precision compute by editing config.py. Taking internvideo2_stage2_config.py as an example, set the following variables to False:

use_half_precision = False
use_bf16 = False
use_flash_attn = False
use_fused_rmsnorm = False
use_fused_mlp = False
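If some code path still imports the fused ops unconditionally, a common workaround is to guard the import and fall back to plain PyTorch ops. This is a sketch of the pattern, not the repository's actual code; the `HAS_FUSED_RMSNORM` flag is a hypothetical name:

```python
# Sketch: make the fused CUDA kernels optional so the model can fall
# back to plain PyTorch layers when the flash-attn extras are missing.
try:
    from flash_attn.ops.rms_norm import DropoutAddRMSNorm
    HAS_FUSED_RMSNORM = True
except ImportError:
    # Covers both a missing flash_attn and a flash_attn built without
    # the csrc/layer_norm extension (the dropout_layer_norm module).
    DropoutAddRMSNorm = None  # callers should use nn.LayerNorm / plain RMSNorm
    HAS_FUSED_RMSNORM = False
```

Downstream code can then branch on the flag instead of failing at import time.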
chinmayad commented 2 months ago

I am still having issues despite installing flash-attn and changing the config.py.

I still get ModuleNotFoundError: No module named 'dropout_layer_norm':

ModuleNotFoundError                       Traceback (most recent call last)
Code/InternVideo/InternVideo2/multi_modality/demo.ipynb Cell 1 line 11
      6 import torch
      8 from config import (Config,
      9                     eval_dict_leaf)
---> 11 from utils import (retrieve_text,
     12                   _frame_from_video,
     13                   setup_internvideo2)

File /Code/InternVideo/InternVideo2/multi_modality/utils.py:9
      6 import torch
      7 from torch import nn
----> 9 from models.backbones.internvideo2 import pretrain_internvideo2_1b_patch14_224
     10 from models.backbones.bert.builder import build_bert
     11 from models.criterions import get_sim

File /Code/InternVideo/InternVideo2/multi_modality/models/backbones/internvideo2/__init__.py:1
----> 1 from .internvl_clip_vision import internvl_clip_6b
      2 from .internvideo2 import pretrain_internvideo2_1b_patch14_224, pretrain_internvideo2_6b_patch14_224
      3 from .internvideo2_clip_vision import InternVideo2

File /Code/InternVideo/InternVideo2/multi_modality/models/backbones/internvideo2/internvl_clip_vision.py:16
     14 from flash_attention_class import FlashAttention
     15 from flash_attn.modules.mlp import FusedMLP
...
     10 def maybe_align(x, alignment_in_bytes=16):
     11     """Assume that x already has last dim divisible by alignment_in_bytes
     12     """

ModuleNotFoundError: No module named 'dropout_layer_norm'