shawntan / scattermoe

Triton-based implementation of Sparse Mixture of Experts.
Apache License 2.0

Segfault CUDA 12.2 #4

Open sshleifer opened 4 months ago

sshleifer commented 4 months ago

Which versions of pytorch/triton/hardware do you run this on?

Traceback

tests/test_mlp.py Fatal Python error: Segmentation fault

Thread 0x00007f23c08b5640 (most recent call first):
  <no Python frame>

...

Thread 0x00007f23c28b9640 (most recent call first):
  <no Python frame>
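
Since every thread in the dump reports `<no Python frame>`, the crash is presumably happening in native code rather than in Python. One way to narrow it down is to run each test file in its own child process and check how that process ended. The following is only a minimal sketch, not part of this repository: `run_isolated` is a hypothetical helper name, and the signal check assumes a POSIX system.

```python
import signal
import subprocess
import sys


def run_isolated(cmd):
    """Run cmd in a child process and describe how it ended."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    # On POSIX, a negative return code means the child was killed by a signal
    # whose number is the absolute value of the return code.
    if proc.returncode < 0:
        try:
            name = signal.Signals(-proc.returncode).name
        except ValueError:
            # Signal number not known to the signal module on this platform.
            name = f"signal {-proc.returncode}"
        return f"killed by {name}"
    return f"exited with code {proc.returncode}"
```

For example, `run_isolated([sys.executable, "-m", "pytest", "tests/test_mlp.py", "--maxfail=1"])` would be expected to report `killed by SIGSEGV` if that file alone reproduces the segfault shown above.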

My Env

I have CUDA 12.1 and an H100.

triton==2.1.0+git17d633a64
torch==2.0.1+gite9ebda2

What I ran:

git clone git@github.com:shawntan/scattermoe.git
cd scattermoe
pip install -e .
CUDA_LAUNCH_BLOCKING=1 pytest tests/ --maxfail=1

findmyway commented 3 months ago

I use the nvcr.io/nvidia/pytorch:23.10-py3 container image.

torch: 2.1.0a0+32f93b1
triton: 2.2.0
CUDA: 12.2
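
To make environment reports in this thread easier to compare, something like the following could be used to collect the same fields in one place. It is a rough sketch under assumptions: `package_version` and `environment_report` are hypothetical helper names (not part of scattermoe), and the GPU fields are only filled in when `torch` imports successfully and a CUDA device is visible.

```python
import platform
from importlib import metadata


def package_version(name):
    """Return the installed version of a distribution, or a placeholder."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"


def environment_report():
    """Collect Python, torch, triton, CUDA and GPU details as a dict."""
    report = {
        "python": platform.python_version(),
        "torch": package_version("torch"),
        "triton": package_version("triton"),
    }
    try:
        import torch
    except Exception:
        # Importing torch can fail for reasons other than absence
        # (e.g. missing shared libraries), so report what we have so far.
        return report
    # CUDA version torch was built against; None on CPU-only builds.
    report["cuda (torch build)"] = str(torch.version.cuda)
    if torch.cuda.is_available():
        report["gpu"] = torch.cuda.get_device_name(0)
    return report


for key, value in environment_report().items():
    print(f"{key}: {value}")
```

Note that the CUDA version printed here is the one torch was built against, which may differ from the toolkit or driver version installed on the machine.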