chengzeyi / stable-fast

Best inference performance optimization framework for HuggingFace Diffusers on NVIDIA GPUs.
MIT License
1.16k stars 70 forks

gcc "-std=c99" may need to be declared in code #147

Open Fengshawn opened 4 months ago

Fengshawn commented 4 months ago

I am currently using an environment with gcc version 4.8.5, which I cannot change or update because it is a public (shared) environment...

/tmp/tmp4s6z8k3o/main.c: In function ‘list_to_cuuint64_array’:
/tmp/tmp4s6z8k3o/main.c:354:3: error: ‘for’ loop initial declarations are only allowed in C99 mode
   for (Py_ssize_t i = 0; i < len; i++) {
   ^
/tmp/tmp4s6z8k3o/main.c:354:3: note: use option -std=c99 or -std=gnu99 to compile your code
/tmp/tmp4s6z8k3o/main.c: In function ‘list_to_cuuint32_array’:
/tmp/tmp4s6z8k3o/main.c:365:3: error: ‘for’ loop initial declarations are only allowed in C99 mode
   for (Py_ssize_t i = 0; i < len; i++) {
   ^

graph(%input, %num_groups, %weight, %bias, %eps, %cudnn_enabled):
    %y = sfast_triton::group_norm_silu(%input, %num_groups, %weight, %bias, %eps)
    return (%y)
RuntimeError: CalledProcessError: Command '['/usr/bin/gcc', '/tmp/tmp4s6z8k3o/main.c', '-O3',
'-I/home/dell/miniconda3/envs/fx-clone-project-base/lib/python3.10/site-packages/triton/common/../third_party/cuda/include', '-I/home/dell/miniconda3/envs/fx-clone-project-base/include/python3.10',
'-I/tmp/tmp4s6z8k3o', '-shared', '-fPIC', '-lcuda', '-o', '/tmp/tmp4s6z8k3o/cuda_utils.cpython-310-x86_64-linux-gnu.so', '-L/lib64', '-L/lib', '-L/lib64', '-L/lib']' returned non-zero exit status 1.

I think that in my local environment I only need to change the command from ['/usr/bin/gcc', '/tmp/tmp4s6z8k3o/main.c' ...] to something like ['/usr/bin/gcc', '-std=c99', '/tmp/tmp4s6z8k3o/main.c' ...], but how do I locate the part of the code that builds this command?

Thanks in advance!
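
One possible workaround that avoids editing any installed package is sketched below. The include path in the failing command contains `triton/common/..`, which suggests the gcc command is assembled inside Triton's `triton/common` package rather than in stable-fast itself. The sketch assumes that this Triton build step honors the `CC` environment variable when choosing the compiler; that is an assumption to verify against the installed Triton version. The helper name `make_c99_wrapper` and the wrapper file name `gcc-c99` are hypothetical. The default flag is `-std=gnu99`, one of the two options the compiler note suggests; pass `-std=c99` instead if preferred.

```python
import os
import stat
import tempfile


def make_c99_wrapper(gcc_path="/usr/bin/gcc", std_flag="-std=gnu99", out_dir=None):
    """Write a shell script that forwards all arguments to gcc with a C99 flag.

    Hypothetical helper: returns the path of the generated wrapper script.
    """
    # Use a fresh temporary directory unless the caller provides one.
    out_dir = out_dir or tempfile.mkdtemp(prefix="gcc-c99-")
    wrapper = os.path.join(out_dir, "gcc-c99")

    with open(wrapper, "w") as f:
        f.write("#!/bin/sh\n")
        # "$@" passes every original argument through unchanged.
        f.write(f'exec "{gcc_path}" {std_flag} "$@"\n')

    # Mark the script as executable for user, group, and others.
    mode = os.stat(wrapper).st_mode
    os.chmod(wrapper, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return wrapper
```

To try it, set `os.environ["CC"] = make_c99_wrapper()` at the very top of the script, before importing `torch`, `triton`, or `sfast`, so that the variable is already set when the helper module is compiled. If the compiler is not picked up this way, the command list itself would have to be changed in the Triton source file that builds it.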