mlc-ai / mlc-llm

Universal LLM Deployment Engine with ML Compilation
https://llm.mlc.ai/
Apache License 2.0

[Bug] AssertionError: sm90 not supported yet - running a model on Nvidia H100 #790

Closed: daniel-kukiela closed this issue 11 months ago

daniel-kukiela commented 1 year ago

🐛 Bug

I'm not sure whether this is a bug or should be a feature request. If your GPU is "too new", you cannot use it. Model compilation fails with:

[...]
  File "/usr/local/lib/python3.8/dist-packages/tvm/contrib/cutlass/gen_gemm.py", line 197, in __init__
    assert sm in GENERATOR_FUNC_TABLE and sm in DEFAULT_KERNELS, f"sm{sm} not supported yet."
AssertionError: sm90 not supported yet.

Is there a way to fall back to the highest supported sm version, or to specify which one to use for model compilation?
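The fallback asked about here could be sketched roughly as follows. This is only an illustration of the idea, not TVM's or mlc-llm's actual behavior: `pick_supported_sm` and `SUPPORTED_SMS` are hypothetical names, and the values in `SUPPORTED_SMS` are placeholders, since the real contents of `GENERATOR_FUNC_TABLE` and `DEFAULT_KERNELS` are not shown in this issue.

```python
# Hypothetical placeholder for the architectures that have kernel tables.
# The real GENERATOR_FUNC_TABLE / DEFAULT_KERNELS contents are not shown here.
SUPPORTED_SMS = (75, 80)


def pick_supported_sm(requested_sm, supported_sms=SUPPORTED_SMS):
    """Return the highest supported sm that does not exceed requested_sm."""
    candidates = [sm for sm in supported_sms if sm <= requested_sm]
    if not candidates:
        # Nothing at or below the requested architecture: fail loudly,
        # as the original assertion does.
        raise ValueError(f"no supported sm at or below sm{requested_sm}")
    return max(candidates)


if __name__ == "__main__":
    # An sm90 device would fall back to the newest supported table entry.
    print(pick_supported_sm(90))
```

Whether kernels generated for an older sm version would run correctly or efficiently on newer hardware is a separate question that such a fallback does not answer.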

To Reproduce

Steps to reproduce the behavior:

  1. Compile a model on an unsupported GPU (here, an Nvidia H100, which reports sm90)

Expected behavior

The model can be compiled and run even if it does not utilize all the capabilities of the hardware.

Environment

junrushao commented 11 months ago

H100 is indeed supported, but for now you will have to disable cutlass by passing --no-cutlass-attn and --no-cutlass-norm.

technillogue commented 11 months ago

Is there any performance penalty from not using cutlass?