mlc-ai / mlc-llm

Universal LLM Deployment Engine with ML Compilation
https://llm.mlc.ai/
Apache License 2.0

[Feature Request] AssertionError: sm61 not supported yet. #802

Closed pangr closed 11 months ago

pangr commented 1 year ago

🚀 Feature

When I execute `python3 -m mlc_llm.build --model xxxx --target cuda --quantization q4f16_1` on a 1080 Ti, it reports `AssertionError: sm61 not supported yet`.


junrushao commented 1 year ago

Well, the 1080 Ti is a bit too old for us to test at this moment. How about using the Vulkan backend instead of CUDA?
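That swap could look like the following sketch, reusing the reporter's command with only the target changed (it assumes the other flags carry over unchanged to the Vulkan target):

```shell
# Same build command as in the original report, but targeting Vulkan.
# "xxxx" is the model placeholder from the report.
python3 -m mlc_llm.build --model xxxx --target vulkan --quantization q4f16_1
```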

Matthieu-Tinycoaching commented 1 year ago

Same here with a GTX 1060.

How about the Vulkan backend? How well does it work with NVIDIA GPUs?

tqchen commented 11 months ago

You can follow the Vulkan instructions for this device at llm.mlc.ai/docs.

gatepoet commented 11 months ago

@tqchen : I'd like to join in on this one. I have an old mining rig with 11x GTX 1060 6GB, so I'd like to use CUDA for parallelization/batching.

tqchen commented 11 months ago

You can try hacking around it by removing that assert from the build Python function. Note that the latest CUDA runtime likely deprecates sm_61 support, so you might need to work with an older one.

In the meantime, Vulkan should be supported out of the box.

junrushao commented 11 months ago

I do think our CUDA backend supports sm61 out of the box, but you will need to compile it yourself. Use extra flags to disable CUTLASS: `--no-cutlass-attn` and `--no-cutlass-norm`.
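Applied to the reporter's command, that might look like the sketch below (the two flags are quoted from this comment; whether a self-compiled build then actually runs on sm_61 is untested here):

```shell
# Build for CUDA with the CUTLASS kernels disabled, per the flags above.
# "xxxx" is the model placeholder from the original report.
python3 -m mlc_llm.build --model xxxx --target cuda --quantization q4f16_1 \
    --no-cutlass-attn --no-cutlass-norm
```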

junrushao commented 11 months ago

We should have a fallback that turns off CUTLASS automatically for non-sm7x/8x devices.
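Such a fallback might gate on the parsed compute capability, roughly like this sketch (the function and flag names here are illustrative, not the actual mlc_llm internals):

```python
# Illustrative sketch of the proposed fallback: enable CUTLASS only for
# sm_7x/8x targets and silently turn it off elsewhere (e.g. sm_61).
# The function name and flag dict are hypothetical, not mlc_llm's API.

def pick_cutlass_flags(arch: str) -> dict:
    """Map a CUDA arch string like 'sm_61' to CUTLASS enable flags."""
    major = int(arch.removeprefix("sm_")) // 10  # 'sm_61' -> 6
    use_cutlass = major in (7, 8)  # CUTLASS kernels assume sm_7x/8x
    return {"cutlass_attn": use_cutlass, "cutlass_norm": use_cutlass}

print(pick_cutlass_flags("sm_61"))  # Pascal card: both flags off
print(pick_cutlass_flags("sm_86"))  # Ampere card: both flags on
```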