state-spaces / mamba

Mamba SSM architecture
Apache License 2.0

triton error #365

Open missingthl opened 3 months ago

missingthl commented 3 months ago

Screenshot from 2024-06-05 22-23-59

I ran into this error: RuntimeError: Triton Error [CUDA]: misaligned address :-(

Kevin-naticl commented 3 months ago

triton>=2.1.0 is needed, but I have another triton problem.
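For anyone who wants to confirm the triton requirement mentioned here, a minimal sketch of a version check follows. It assumes triton was installed with pip under the distribution name triton. The helpers parse_version, meets_minimum, and check_triton are hypothetical names made up for this illustration, not part of mamba or triton, and the parsing is deliberately simple (it ignores pre-release tags).

```python
import re
from importlib.metadata import PackageNotFoundError, version


def parse_version(text):
    """Turn '2.1.0' or '2.1.0+cu118' into a 3-part tuple of ints, e.g. (2, 1, 0)."""
    parts = []
    # Drop any local build suffix such as '+cu118', then read leading digits.
    for piece in text.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    # Pad so that '2.1' compares equal to '2.1.0'.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed, minimum="2.1.0"):
    """Return True if the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


def check_triton(minimum="2.1.0"):
    """Describe whether the installed triton satisfies the minimum version."""
    try:
        installed = version("triton")
    except PackageNotFoundError:
        return "triton is not installed"
    status = "OK" if meets_minimum(installed, minimum) else "too old"
    return f"triton {installed}: {status} (need >= {minimum})"


if __name__ == "__main__":
    print(check_triton())
```

Passing this check only rules out an outdated triton; as this thread shows, the misaligned address error may still have other causes.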

Kevin-naticl commented 3 months ago

Here is my issue: https://github.com/state-spaces/mamba/issues/370

mahao18cm commented 2 months ago

Did you solve this problem? I still have it.

Kevin-naticl commented 2 months ago

I still have this problem. Maybe try to find the problem in the training code. I'm doing this right now.

missingthl commented 2 months ago

triton>=2.1.0 is needed, but I have another triton problem.

My device is also a 4090. My environment is CUDA 11.8, PyTorch 2.1.2, and causal-conv1d 1.1.3, and it runs mamba1 correctly. I am trying to reconfigure the environment. I saw your issue; I hope it's an environment configuration problem, otherwise it might be too complicated to solve.

mahao18cm commented 2 months ago

I still have this problem. Maybe try to find the problem in the training code. I'm doing this right now.

Actually, mamba2 is trash compared to mamba1. I just replaced mamba1 now. To my surprise, mamba2 needs more memory.

Kevin-naticl commented 2 months ago

Thanks for the reply. My device is also a 4090. When I set my model to eval mode, it works fine. However, when I train it on multiple GPUs, the problem appears. I'm trying to rewrite my training code.

weili419 commented 3 days ago

Hello, has the suspected 'memory misalignment' error been resolved? Can you share your experience?

missingthl commented 3 days ago

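Hello, has the suspected 'memory misalignment' error been resolved? Can you share your experience?

It was probably a triton version issue. I went to the mamba project and the causal-conv1d project, found the matching versions, rebuilt my environment, and got it running.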

Kevin-naticl commented 3 days ago

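It is most likely related to both the CUDA version and the triton version. Judging from the packages the author has released, CUDA 12.2 and later seem to differ somewhat from earlier versions. You could give that a try.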
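Since the error seems to depend on the combination of versions, it may help to print them all in one place before comparing against a setup known to work. The snippet below is only a sketch: it assumes the packages were installed with pip under the distribution names torch, triton, causal-conv1d, and mamba-ssm. The helpers installed_version, cuda_version, and environment_report are hypothetical names used for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed pip distribution names; adjust if your install uses different ones.
PACKAGES = ("torch", "triton", "causal-conv1d", "mamba-ssm")


def installed_version(name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def cuda_version():
    """Return the CUDA version PyTorch was built against, or None if unavailable."""
    try:
        import torch
    except ImportError:
        return None
    # torch.version.cuda is None for CPU-only builds of PyTorch.
    return torch.version.cuda


def environment_report():
    """Collect package versions and the CUDA build version into one dict."""
    report = {name: installed_version(name) for name in PACKAGES}
    report["cuda (torch build)"] = cuda_version()
    return report


if __name__ == "__main__":
    for name, value in environment_report().items():
        print(f"{name}: {value or 'not found'}")
```

Pasting this output into the issue makes it easier for others to spot a mismatch between the CUDA, PyTorch, triton, and causal-conv1d versions.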
