NX-AI / xlstm

Official repository of the xLSTM.
GNU Affero General Public License v3.0

RuntimeError: Error building extension 'slstm_HS64BS8NH1NS4DBfDRbDWbDGbDSbDAfNG4SA1GRCV0GRC0d0FCV0FC0d0' #25

Open God-YYB opened 2 weeks ago

God-YYB commented 2 weeks ago

This problem appeared after I solved "RuntimeError: Ninja is required to load C++ extensions" with "pip3 install Ninja".

miaozhixu commented 2 weeks ago

Would you please paste your output log? I faced the same problem on Windows 11. It's caused by the CUDA libraries: Ninja assembles the nvcc command line incorrectly, so the build cannot find the CUDA libraries. I guess the cause is the space character in the path string. But on Ubuntu 22.04 with NVIDIA driver 535, CUDA 12.1, cuDNN 9, PyTorch 2.3.1, and Ninja installed, there is no problem. PS: the new xlstm version 1.0.4 has something wrong in the sLSTM layer source; you should try 1.0.3 on Ubuntu.
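The space-in-path guess above can be checked quickly. This is a minimal sketch, assuming the CUDA location is exposed through the CUDA_HOME or CUDA_PATH environment variables; the helper name cuda_paths_with_spaces is hypothetical, and the check only tests the guess, it does not confirm the root cause.

```python
import os


def cuda_paths_with_spaces(env=None):
    """Return {variable: value} for CUDA variables whose path contains a space.

    Only CUDA_HOME and CUDA_PATH are inspected (an assumption of this sketch).
    """
    env = os.environ if env is None else env
    suspects = {}
    for name in ("CUDA_HOME", "CUDA_PATH"):
        value = env.get(name)
        if value and " " in value:
            suspects[name] = value
    return suspects


if __name__ == "__main__":
    found = cuda_paths_with_spaces()
    if not found:
        print("No spaces found in CUDA_HOME / CUDA_PATH (or they are unset).")
    for name, value in found.items():
        print(f"{name} contains a space: {value!r}")
```

If a variable shows up here, the default Windows install location under "Program Files" is the likely reason.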

Marco-Nguyen commented 1 week ago

@miaozhixu Hi, may I ask about your setup steps? I am facing the above issues and could not find a solution up to now

miaozhixu commented 1 week ago

Bring up a fresh Ubuntu 22.04.4 installation. It comes with NVIDIA GPU driver 535. Follow the documentation on nvidia.com to install CUDA 12.1 and cuDNN 9. Install PyTorch 2.3.1 with CUDA support. Use pip to install xlstm; I recommend installing xlstm v1.0.3.
I tried 1.0.4 yesterday, and the return_last_state parameter led to an error.

But somebody said that this issue can be solved with "conda install cccl". Check the link below: https://github.com/NX-AI/xlstm/issues/19#issuecomment-2162005087
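To compare an environment against the versions listed above, something like the following may help. It is a sketch, not part of xlstm: report_versions is a hypothetical helper, and it assumes the packages were installed under the distribution names xlstm, torch and ninja.

```python
from importlib import metadata


def report_versions():
    """Collect installed versions of the packages mentioned in this thread."""
    versions = {}
    for package in ("xlstm", "torch", "ninja"):
        try:
            versions[package] = metadata.version(package)
        except metadata.PackageNotFoundError:
            versions[package] = None  # not installed in this environment

    try:
        import torch
    except ImportError:
        return versions

    # CUDA and cuDNN versions that this PyTorch build was compiled against.
    versions["torch_cuda"] = torch.version.cuda
    versions["cudnn"] = torch.backends.cudnn.version()
    versions["cuda_available"] = torch.cuda.is_available()
    return versions


if __name__ == "__main__":
    for name, value in report_versions().items():
        print(f"{name}: {value}")
```

Pasting this output into a reply makes it easier to see how a failing setup differs from the working one.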

Marco-Nguyen commented 1 week ago

So you installed xlstm via pip, not by cloning the repo, right?

miaozhixu commented 1 week ago

Yep

yongyin-ma commented 1 week ago

Check your log file; there should be a few fatal errors, or messages saying that some file doesn't exist. I had the same error on Windows 10, then tried Linux and still had the problem, so I thought it might be a root or environment problem. Echo your $PATH and $LD_LIBRARY_PATH and see whether they include the CUDA path. If not, follow the steps below. First check your CUDA installation location; once you have found it, run:
"""
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
"""
Remember to replace the path with your own. This should solve the problem; it worked for me. The other solution, "pip install cccl", didn't work in my situation.

Also, the README tells us to run "python experiments/main.py --config experiments/parity_xLSTM01.yaml", which gave me a "no such file" error: the actual name of the YAML file contains no upper-case letters.

And you should use pip install xlstm==1.0.3; there are other bugs in 1.0.4.
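The PATH and LD_LIBRARY_PATH check described in this comment can be sketched as below. It assumes a Linux system and the /usr/local/cuda location used in the export lines; missing_cuda_entries is a hypothetical helper, and the path should be changed to match the actual CUDA installation.

```python
import os
import shutil


def missing_cuda_entries(cuda_root="/usr/local/cuda", env=None):
    """Return {variable: directory} for CUDA directories absent from the environment."""
    env = os.environ if env is None else env
    wanted = {
        "PATH": f"{cuda_root}/bin",
        "LD_LIBRARY_PATH": f"{cuda_root}/lib64",
    }
    missing = {}
    for name, directory in wanted.items():
        # Both variables are colon-separated lists on Linux.
        entries = [e.rstrip("/") for e in env.get(name, "").split(":") if e]
        if directory not in entries:
            missing[name] = directory
    return missing


if __name__ == "__main__":
    print("nvcc found at:", shutil.which("nvcc"))
    for name, directory in missing_cuda_entries().items():
        # Print the export line that would add the missing directory.
        print(f"export {name}={directory}:${name}")
```

The script only prints suggested export lines; it does not change the environment of the calling shell.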

God-YYB commented 1 week ago

The YAML file name has a case error that is easy to resolve. I will try the PATH solution you mentioned. Thank you.
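One way to work around the file-name case mismatch without guessing the exact spelling is to look the config up case-insensitively. This is a small sketch; find_config is a hypothetical helper and not part of the xlstm repository.

```python
from pathlib import Path


def find_config(path):
    """Return the file matching `path`, ignoring case in the file name.

    An exact match wins; otherwise the first case-insensitive match
    (in sorted order) is returned, or None if nothing matches.
    """
    requested = Path(path)
    if not requested.parent.is_dir():
        return None
    candidates = sorted(requested.parent.iterdir())
    for candidate in candidates:
        if candidate.name == requested.name:
            return candidate
    for candidate in candidates:
        if candidate.name.lower() == requested.name.lower():
            return candidate
    return None


if __name__ == "__main__":
    # Path taken from the README command quoted in this thread.
    print(find_config("experiments/parity_xLSTM01.yaml"))
```

The printed path can then be passed to the --config option of experiments/main.py.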

God-YYB commented 1 week ago

Thank you, I will try installing xlstm v1.0.3 with pip later. The version details are very helpful!

yanpeng0520 commented 1 week ago

Hi, I am trying to install the same versions as you, but when I install cudnn>9, it returns an error: torch 2.3.1+cu121 requires nvidia-cudnn-cu12==8.9.2.26. How did you do that?

miaozhixu commented 1 week ago

nvidia-cudnn-cu12==8.9.2.26

I followed the nvidia.com guide to install cuDNN and CUDA. After successfully installing these two libraries, I installed PyTorch.

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt-get install cuda-toolkit-12-1

Then add /usr/local/cuda/bin to PATH.

sudo apt-get install cudnn9-cuda-12

Finally, install PyTorch:

conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia

I use Anaconda 3, but I think pip will work just fine.
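The error quoted in the question suggests that the pip wheel of torch pins its own nvidia-cudnn-cu12 package, which is separate from the system-wide cudnn9-cuda-12 installed with apt. A sketch for listing those pins from the installed torch metadata follows; nvidia_pins is a hypothetical helper, and a conda-installed torch may list no such requirements at all.

```python
from importlib import metadata


def nvidia_pins(requirements):
    """Keep only the requirement strings that name an nvidia-* package."""
    return [r for r in (requirements or []) if r.lower().startswith("nvidia-")]


if __name__ == "__main__":
    try:
        requirements = metadata.requires("torch")  # may be None
    except metadata.PackageNotFoundError:
        requirements = None
        print("torch is not installed in this environment")
    for line in nvidia_pins(requirements):
        print(line)
```

If nvidia-cudnn-cu12 appears in the output, installing a different version of that pip package by hand is what triggers the dependency conflict.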

Atlantis-esh commented 4 days ago

How do you all switch Ubuntu versions? Isn't it normal to use a laboratory server? Isn't the Ubuntu system on a server fixed? Or are you using a virtual machine? :(

Atlantis-esh commented 4 days ago

How do you all switch the Ubuntu version to 22.04? Isn't it normal to use a laboratory server? Isn't the Ubuntu system on a server fixed? Or are you all using a virtual machine? :(

Atlantis-esh commented 4 days ago

Hello, may I ask how you switch the Ubuntu version to 22.04? Isn't it normal to use a laboratory server? Isn't the Ubuntu system on a server fixed? Or are you all using a virtual machine? :(

miaozhixu commented 2 days ago

Ubuntu is installed on my laptop, alongside Windows 11. This laptop has an RTX5000 GPU.