mlc-ai / mlc-llm

Universal LLM Deployment Engine with ML Compilation
https://llm.mlc.ai/
Apache License 2.0

[Bug] Fail to build tvm-unity from source on Orin #2389

Closed Louym closed 4 months ago

Louym commented 5 months ago

🐛 Bug

When building tvm-unity from source on my platform, it fails to produce libtvm_runtime. The errors look like this: [error screenshot]

To Reproduce

I just followed the documentation, and my settings are: [settings screenshot]

Expected behavior

I expect the build to successfully produce libtvm_runtime and libtvm, so that I can install tvm-unity in my virtual environment.

Environment

oglok commented 4 months ago

Hi @Louym did you have any luck building mlc-llm and tvm on an Orin? I'm trying to go through the same, and I'm having a really hard time.

Louym commented 4 months ago

Not yet, @oglok. Actually, I just want the nsys profiling results for MLC-LLM, and I can now run MLC-LLM in Jetson containers. I also tried `pip install tvm`, which has prebuilt wheels for the arm64 platform, but MLC-LLM seems to strictly require the relax version of tvm. You may find more in this issue: https://github.com/dusty-nv/jetson-containers/issues/531
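
Aside: one quick way to tell whether an installed `tvm` wheel is a relax/unity build is to check for the `tvm.relax` submodule, which older mainline wheels don't ship. A small sketch (the helper name is made up; this is just a module-presence probe, not an official compatibility check):

```python
import importlib.util

def has_module(name: str) -> bool:
    """Return True if `name` is importable, without importing it fully."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Parent package (e.g. `tvm`) is not installed at all.
        return False

if __name__ == "__main__":
    print("tvm installed:", has_module("tvm"))
    print("tvm.relax available:", has_module("tvm.relax"))
```

If `tvm.relax` is missing, the wheel is likely a mainline build and won't satisfy MLC-LLM.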

MasterJH5574 commented 4 months ago

Hey @Louym @oglok, sorry for the late reply. Could you please check out the latest tvm, add the following lines to build/config.cmake, and try building again? There were some changes in the FlashInfer build (where the error you see happens), so I wonder whether the latest tvm commit works.

set(USE_FLASHINFER ON)
set(FLASHINFER_ENABLE_FP8 OFF)
set(FLASHINFER_ENABLE_BF16 OFF)
set(FLASHINFER_GEN_GROUP_SIZES 1 4 6 8)
set(FLASHINFER_GEN_PAGE_SIZES 16)
set(FLASHINFER_GEN_HEAD_DIMS 128)
set(FLASHINFER_GEN_KV_LAYOUTS 0 1)
set(FLASHINFER_GEN_POS_ENCODING_MODES 0 1)
set(FLASHINFER_GEN_ALLOW_FP16_QK_REDUCTIONS "false")
set(FLASHINFER_GEN_CASUALS "false" "true")
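The change boils down to appending those `set(...)` lines to TVM's build config before re-running CMake. A minimal sketch in Python that does the append idempotently (the flag list is from this thread; the helper name and path handling are made up for illustration):

```python
from pathlib import Path

# FlashInfer options as suggested in this thread.
FLASHINFER_FLAGS = [
    'set(USE_FLASHINFER ON)',
    'set(FLASHINFER_ENABLE_FP8 OFF)',
    'set(FLASHINFER_ENABLE_BF16 OFF)',
    'set(FLASHINFER_GEN_GROUP_SIZES 1 4 6 8)',
    'set(FLASHINFER_GEN_PAGE_SIZES 16)',
    'set(FLASHINFER_GEN_HEAD_DIMS 128)',
    'set(FLASHINFER_GEN_KV_LAYOUTS 0 1)',
    'set(FLASHINFER_GEN_POS_ENCODING_MODES 0 1)',
    'set(FLASHINFER_GEN_ALLOW_FP16_QK_REDUCTIONS "false")',
    'set(FLASHINFER_GEN_CASUALS "false" "true")',
]

def enable_flashinfer(config_path: str) -> None:
    """Append any FlashInfer flags missing from build/config.cmake."""
    path = Path(config_path)
    existing = path.read_text() if path.exists() else ""
    missing = [flag for flag in FLASHINFER_FLAGS if flag not in existing]
    if missing:
        with path.open("a") as fh:
            fh.write("\n" + "\n".join(missing) + "\n")
```

After running this against `build/config.cmake`, re-run `cmake ..` and `make` in the build directory as usual.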
Louym commented 4 months ago

@MasterJH5574 Thank you very much! It works perfectly!