Hello! Baichuan's modeling code currently appears to use xops.memory_efficient_attention to enable flash-attention. Without flash-attention, memory usage grows quadratically with sequence length, so inference at a length of 20k can indeed OOM. We suggest enabling flash-attention via from xformers import ops as xops.
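For a sense of scale: without flash-attention, the full attention score matrix at 20k tokens is roughly 20000 x 20000 fp16 values, about 0.75 GB per head per layer, which is where the OOM comes from. Below is a minimal sketch of the memory-efficient attention call; the shapes, tensor names, and the causal mask are illustrative, not Baichuan's actual modeling code.

import torch
from xformers import ops as xops

batch, seq_len, n_heads, head_dim = 1, 21000, 40, 128  # illustrative sizes
q = torch.randn(batch, seq_len, n_heads, head_dim,
                dtype=torch.float16, device="cuda")
k = torch.randn_like(q)
v = torch.randn_like(q)

# memory_efficient_attention takes (batch, seq_len, n_heads, head_dim) tensors
# and never materializes the full seq_len x seq_len score matrix, so peak
# memory grows roughly linearly rather than quadratically with length.
out = xops.memory_efficient_attention(
    q, k, v,
    attn_bias=xops.LowerTriangularMask(),  # causal mask for decoder-only models
)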
Thanks for your answer, but in the environment below, installing xformers always reinstalls torch for me, and the reinstalled setup then fails. What exactly should I do?
bitsandbytes 0.41.1
open-clip-torch 2.20.0
peft 0.5.0
pytorch-lightning 1.7.7
pytorch-metric-learning 2.3.0
pytorch-wavelets 1.3.0
pytorch-wpe 0.0.1
pytorch3d 0.7.4
rotary-embedding-torch 0.3.0
sentencepiece 0.1.99
taming-transformers-rom1504 0.0.6
torch 2.0.1+cu118
torch-complex 0.4.3
torch-scatter 2.1.1
torchaudio 2.0.2+cu118
torchmetrics 0.11.4
torchsummary 1.5.1
torchvision 0.15.2+cu118
transformers 4.34.1
transformers-stream-generator 0.0.4
DEPRECATION: pytorch-lightning 1.7.7 has a non-standard dependency specifier torch>=1.9.*. pip 23.3 will enforce this behaviour change. A possible replacement is to upgrade to a newer version of pytorch-lightning or contact the author to suggest that they release a version with a conforming dependency specifiers. Discussion can be found at https://github.com/pypa/pip/issues/12063
Installing collected packages: triton, nvidia-nvtx-cu12, nvidia-nvjitlink-cu12, nvidia-nccl-cu12, nvidia-curand-cu12, nvidia-cufft-cu12, nvidia-cuda-runtime-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-cupti-cu12, nvidia-cublas-cu12, nvidia-cusparse-cu12, nvidia-cudnn-cu12, nvidia-cusolver-cu12, torch, xformers
Attempting uninstall: triton
Found existing installation: triton 2.0.0
Uninstalling triton-2.0.0:
Successfully uninstalled triton-2.0.0
Attempting uninstall: torch
Found existing installation: torch 2.0.1+cu118
Uninstalling torch-2.0.1+cu118:
Successfully uninstalled torch-2.0.1+cu118
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
fairseq 0.12.2 requires hydra-core<1.1,>=1.0.7, but you have hydra-core 1.3.2 which is incompatible.
fairseq 0.12.2 requires omegaconf<2.1, but you have omegaconf 2.3.0 which is incompatible.
fastai 2.7.12 requires torch<2.1,>=1.7, but you have torch 2.1.1 which is incompatible.
torchaudio 2.0.2+cu118 requires torch==2.0.1, but you have torch 2.1.1 which is incompatible.
torchvision 0.15.2+cu118 requires torch==2.0.1, but you have torch 2.1.1 which is incompatible.
wenetruntime 1.11.0 requires torch==1.11.0, but you have torch 2.1.1 which is incompatible.
Successfully installed nvidia-cublas-cu12-12.1.3.1 nvidia-cuda-cupti-cu12-12.1.105 nvidia-cuda-nvrtc-cu12-12.1.105 nvidia-cuda-runtime-cu12-12.1.105 nvidia-cudnn-cu12-8.9.2.26 nvidia-cufft-cu12-11.0.2.54 nvidia-curand-cu12-10.3.2.106 nvidia-cusolver-cu12-11.4.5.107 nvidia-cusparse-cu12-12.1.0.106 nvidia-nccl-cu12-2.18.1 nvidia-nvjitlink-cu12-12.3.101 nvidia-nvtx-cu12-12.1.105 torch-2.1.1 triton-2.1.0 xformers-0.0.23
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
root@dsw-5143-9979c9f66-knjrl:/mnt/workspace/extend_length/test# python
Python 3.8.16 (default, Jun 12 2023, 18:09:05)
[GCC 11.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> print(torch.cuda.is_available())
/opt/conda/lib/python3.8/site-packages/torch/cuda/__init__.py:138: UserWarning: CUDA initialization: The NVIDIA driver on your system is too old (found version 11080). Please update your GPU driver by downloading and installing a new version from the URL: http://www.nvidia.com/Download/index.aspx Alternatively, go to: https://pytorch.org to install a PyTorch version that has been compiled with your version of the CUDA driver. (Triggered internally at ../c10/cuda/CUDAFunctions.cpp:108.)
return torch._C._cuda_getDeviceCount() > 0
False
What you are running into looks like a conflict between packages; you could try asking in the xformers GitHub repo.
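Judging from your log, pip pulled in xformers 0.0.23, which pins torch 2.1.1; the default PyPI wheel of torch 2.1.1 bundles CUDA 12.1 libraries, while your driver only supports CUDA 11.8, which is why torch.cuda.is_available() now returns False. One option, assuming a wheel for your platform exists, is to pin an older xformers release built against torch 2.0.1 (e.g. xformers 0.0.21, but please verify which release matches your torch) and then sanity-check the pairing:

# Quick sanity check after reinstalling a torch/xformers pair built for the
# same CUDA version (e.g. torch 2.0.1+cu118 with a matching xformers wheel).
import torch
import xformers

print(torch.__version__)          # expect a +cu118 build on a CUDA 11.8 driver
print(torch.version.cuda)         # bundled CUDA runtime; should print 11.8 here
print(torch.cuda.is_available())  # True once the runtime matches the driver
print(xformers.__version__)       # the pinned xformers release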
I want to compare the effects of direct extrapolation, interpolation, and NTK, so I deleted the truncation logic in pred.py. At seq_len = 21000, my A800 then reported an OOM error. Is this expected, and how should I fix it? I am not using xformers.
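For context, the three RoPE variants I am comparing look roughly like this. This is a minimal sketch; the function name, the scale factor, and the defaults are my own illustration, not code from pred.py.

import torch

def rope_angles(head_dim, seq_len, base=10000.0, mode="extrapolation", scale=4.0):
    if mode == "ntk":
        # NTK-aware scaling: enlarge the base so high-frequency dimensions are
        # barely changed while low-frequency ones are effectively interpolated.
        base = base * scale ** (head_dim / (head_dim - 2))
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
    t = torch.arange(seq_len).float()
    if mode == "interpolation":
        # Position interpolation: compress positions back into the trained range.
        t = t / scale
    # mode == "extrapolation": feed positions beyond the trained length unchanged.
    return torch.outer(t, inv_freq)  # angles for the cos/sin tables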