AatroxZZ opened this issue 7 months ago
What is your error?
Traceback (most recent call last):
  File "/mnt/data/users/zxb/EasyContext/train.py", line 11, in <module>
The installed package versions are:
accelerate 0.28.0 aiohttp 3.9.3 aiosignal 1.3.1 annotated-types 0.6.0 appdirs 1.4.4 async-timeout 4.0.3 attrs 23.2.0 certifi 2024.2.2 charset-normalizer 3.3.2 click 8.1.7 contourpy 1.2.1 cycler 0.12.1 datasets 2.18.0 deepspeed 0.14.0 dill 0.3.8 docker-pycreds 0.4.0 einops 0.7.0 evaluate 0.4.1 filelock 3.13.1 flash-attn 2.5.6 fonttools 4.51.0 frozenlist 1.4.1 fsspec 2024.2.0 gitdb 4.0.11 GitPython 3.1.43 hjson 3.1.0 huggingface-hub 0.22.2 idna 3.6 Jinja2 3.1.3 joblib 1.3.2 kiwisolver 1.4.5 MarkupSafe 2.1.3 matplotlib 3.8.4 mpmath 1.2.1 multidict 6.0.5 multiprocess 0.70.16 networkx 3.2.1 ninja 1.11.1.1 numpy 1.26.4 nvidia-cublas-cu12 12.1.3.1 nvidia-cuda-cupti-cu12 12.1.105 nvidia-cuda-nvrtc-cu12 12.1.105 nvidia-cuda-runtime-cu12 12.1.105 nvidia-cudnn-cu12 8.9.2.26 nvidia-cufft-cu12 11.0.2.54 nvidia-curand-cu12 10.3.2.106 nvidia-cusolver-cu12 11.4.5.107 nvidia-cusparse-cu12 12.1.0.106 nvidia-nccl-cu12 2.20.5 nvidia-nvjitlink-cu12 12.1.105 nvidia-nvtx-cu12 12.1.105 packaging 24.0 pandas 2.2.1 pillow 10.3.0 pip 23.3.1 protobuf 4.25.3 psutil 5.9.8 py-cpuinfo 9.0.0 pyarrow 15.0.2 pyarrow-hotfix 0.6 pydantic 2.6.4 pydantic_core 2.16.3 pynvml 11.5.0 pyparsing 3.1.2 python-dateutil 2.9.0.post0 pytorch-triton 3.0.0+989adb9a29 pytz 2024.1 PyYAML 6.0.1 quanto 0.1.0 regex 2023.12.25 requests 2.31.0 responses 0.18.0 ring-flash-attn 0.1 safetensors 0.4.2 scikit-learn 1.4.1.post1 scipy 1.13.0 seaborn 0.13.2 sentencepiece 0.2.0 sentry-sdk 1.44.1 setproctitle 1.3.3 setuptools 68.2.2 six 1.16.0 smmap 5.0.1 sympy 1.12 threadpoolctl 3.4.0 tokenizers 0.15.2 torch 2.4.0.dev20240324+cu121 tqdm 4.66.2 transformers 4.39.1 typing_extensions 4.8.0 tzdata 2024.1 urllib3 2.2.1 wandb 0.16.6 wheel 0.41.2 xxhash 3.4.1 yarl 1.9.4
Your flash attention is not installed correctly. You can try to compile it from source:
git clone https://github.com/Dao-AILab/flash-attention
cd flash-attention
python setup.py install
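Before recompiling, it may help to confirm which wheels are actually installed and whether their versions match; a minimal stdlib-only sketch (the package names `flash-attn` and `torch` are taken from the pip list above):

```python
from importlib.metadata import version, PackageNotFoundError

def installed_version(pkg):
    # Return the installed version string, or None if the package is absent.
    try:
        return version(pkg)
    except PackageNotFoundError:
        return None

# Note: flash-attn ships prebuilt wheels per torch/CUDA combination, so a
# version printed here does not guarantee the compiled extension loads;
# also run `python -c "import flash_attn"` to surface ABI/import errors.
for pkg in ("flash-attn", "torch"):
    print(pkg, installed_version(pkg))
```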
Your flash attention is not installed correctly. You can try to compile it from source:
git clone https://github.com/Dao-AILab/flash-attention
cd flash-attention
python setup.py install
Still not working...
Could this be a version issue? May I ask which flash-attn version the author is using? I have tried many combinations of torch and flash-attn versions, but all of them failed, and compiling from source still results in the error above.
Mine is 2.5.6.
I have tried many combinations of torch and flash-attn versions, but all of them failed, and compiling from source still results in the error above.
I'm not sure then; I didn't use Docker.
accelerate 0.29.1 aiohttp 3.9.3 aiosignal 1.3.1 annotated-types 0.6.0 anyio 4.3.0 appdirs 1.4.4 archspec 0.2.3 async-timeout 4.0.3 attrs 23.2.0 beautifulsoup4 4.12.3 boltons 23.1.1 Brotli 1.1.0 cachetools 5.3.3 certifi 2024.2.2 cffi 1.16.0 charset-normalizer 3.3.2 click 8.1.7 colorama 0.4.6 conda 24.1.2 conda-libmamba-solver 24.1.0 conda-package-handling 2.2.0 conda_package_streaming 0.9.0 contourpy 1.2.1 cycler 0.12.1 datasets 2.17.1.dev0 deepspeed 0.14.0 dill 0.3.8 distro 1.9.0 docker-pycreds 0.4.0 einops 0.7.0 evaluate 0.4.1 exceptiongroup 1.2.0 fastapi 0.110.1 filelock 3.13.1 flash_attn 2.5.6 fonttools 4.51.0 frozenlist 1.4.1 fsspec 2024.2.0 gitdb 4.0.11 GitPython 3.1.43 h11 0.14.0 hjson 3.1.0 httptools 0.6.1 huggingface-hub 0.22.2 idna 3.6 iniconfig 2.0.0 Jinja2 3.1.3 joblib 1.3.2 jsonpatch 1.33 jsonpointer 2.4 kiwisolver 1.4.5 libmambapy 1.5.7 loguru 0.7.2 mamba 1.5.7 MarkupSafe 2.1.3 matplotlib 3.8.4 menuinst 2.0.2 mpmath 1.2.1 multidict 6.0.5 multiprocess 0.70.16 networkx 3.2.1 ninja 1.11.1.1 numpy 1.26.4 nvidia-cublas-cu12 12.1.3.1 nvidia-cuda-cupti-cu12 12.1.105 nvidia-cuda-nvrtc-cu12 12.1.105 nvidia-cuda-runtime-cu12 12.1.105 nvidia-cudnn-cu12 8.9.2.26 nvidia-cufft-cu12 11.0.2.54 nvidia-curand-cu12 10.3.2.106 nvidia-cusolver-cu12 11.4.5.107 nvidia-cusparse-cu12 12.1.0.106 nvidia-ml-py 12.535.133 nvidia-nccl-cu12 2.20.5 nvidia-nvjitlink-cu12 12.1.105 nvidia-nvtx-cu12 12.1.105 nvitop 1.3.2 packaging 24.0 pandas 2.2.1 pillow 10.3.0 pip 24.0 platformdirs 4.2.0 pluggy 1.4.0 protobuf 4.25.3 psutil 5.9.8 py-cpuinfo 9.0.0 pyarrow 15.0.2 pyarrow-hotfix 0.6 pycosat 0.6.6 pycparser 2.21 pydantic 2.6.4 pydantic_core 2.16.3 pynvml 11.5.0 pyparsing 3.1.2 PySocks 1.7.1 pytest 8.1.1 python-dateutil 2.9.0.post0 python-dotenv 1.0.1 pytorch-triton 3.0.0+989adb9a29 pytz 2024.1 PyYAML 6.0.1 quanto 0.1.0 regex 2023.12.25 requests 2.31.0 responses 0.18.0 ring_flash_attn 0.1 ruamel.yaml 0.18.6 ruamel.yaml.clib 0.2.8 safetensors 0.4.2 scikit-learn 1.4.1.post1 scipy 1.13.0 
seaborn 0.13.2 sentencepiece 0.2.0 sentry-sdk 1.44.1 setproctitle 1.3.3 setuptools 69.2.0 six 1.16.0 smmap 5.0.1 sniffio 1.3.1 soupsieve 2.5 starlette 0.37.2 sympy 1.12 termcolor 2.4.0 threadpoolctl 3.4.0 tokenizers 0.15.2 tomli 2.0.1 torch 2.4.0.dev20240324+cu121 tqdm 4.66.2 transformers 4.39.1 truststore 0.8.0 typing_extensions 4.8.0 tzdata 2024.1 urllib3 2.2.1 uvicorn 0.29.0 uvloop 0.19.0 wandb 0.16.6 watchfiles 0.21.0 websockets 12.0 wheel 0.43.0 xxhash 3.4.1 yarl 1.9.4 zstandard 0.22.0
Thanks! I can train with the image pytorch:22.12-py3. I think the CUDA version (11.8) needs to correspond to the torch version (2.4.0.dev20240324+cu118).
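The cu118/cu121 mismatch described above can be spotted by looking at the local-version suffix of the torch wheel and comparing it against the CUDA toolkit in the container image; a rough string check (version strings taken from this thread):

```python
def cuda_tag(torch_version):
    # Extract the CUDA tag from a torch wheel version string, e.g.
    # "2.4.0.dev20240324+cu121" -> "cu121"; returns None for CPU-only builds.
    _, sep, local = torch_version.partition("+")
    return local if sep and local.startswith("cu") else None

# The wheel in the pip list above was built against CUDA 12.1 ("cu121"),
# while the pytorch:22.12-py3 image ships CUDA 11.8 -> mismatch; a
# "+cu118" wheel matches that image.
print(cuda_tag("2.4.0.dev20240324+cu121"))  # cu121
print(cuda_tag("2.4.0.dev20240324+cu118"))  # cu118
```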
I want to ask which image is used for this job. I can't run train.sh after completing the installation with pytorch:23.06, following the steps in the installation instructions.