Closed Louym closed 4 months ago
Hi @Louym, did you have any luck building mlc-llm and tvm on an Orin? I'm trying to do the same and I'm having a really hard time.
Not yet, @oglok. Indeed, I just want the nsys results for MLC-LLM, and I am now able to use MLC-LLM in Jetson containers. I also tried to pip install tvm, which has prebuilt wheels for the arm64 platform, but it seems MLC-LLM strictly requires the relax version of tvm. You may find more in this issue: https://github.com/dusty-nv/jetson-containers/issues/531
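For context, capturing nsys results for an MLC-LLM run typically looks like the sketch below. `run_mlc.py` is a hypothetical placeholder for the actual entry point, and the trace flags are common Nsight Systems options rather than anything prescribed in this thread:

```shell
# Profile a CUDA workload with Nsight Systems; run_mlc.py is hypothetical.
nsys profile -t cuda,nvtx -o mlc_report python3 run_mlc.py

# Print summary tables (kernel times, memcpy stats, etc.) from the report.
nsys stats mlc_report.nsys-rep
```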
Hey @Louym @oglok, sorry for the late reply. Could you please check out the latest tvm, add the following lines to build/config.cmake, and try building again?

```cmake
set(USE_FLASHINFER ON)
set(FLASHINFER_ENABLE_FP8 OFF)
set(FLASHINFER_ENABLE_BF16 OFF)
set(FLASHINFER_GEN_GROUP_SIZES 1 4 6 8)
set(FLASHINFER_GEN_PAGE_SIZES 16)
set(FLASHINFER_GEN_HEAD_DIMS 128)
set(FLASHINFER_GEN_KV_LAYOUTS 0 1)
set(FLASHINFER_GEN_POS_ENCODING_MODES 0 1)
set(FLASHINFER_GEN_ALLOW_FP16_QK_REDUCTIONS "false")
set(FLASHINFER_GEN_CASUALS "false" "true")
```

There were some changes in the FlashInfer build (where the error you see happens), so I wonder whether the latest tvm commit works.
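For reference, a minimal sketch of where these lines fit in a source build. The paths and commands follow the standard TVM build flow and are assumptions, not steps taken from this thread:

```shell
# Assumed layout: run from the root of a tvm checkout.
mkdir -p build
cp cmake/config.cmake build/

# Append the FLASHINFER set(...) lines shown above to build/config.cmake, then:
cd build
cmake ..
make -j"$(nproc)"   # expected to produce libtvm.so and libtvm_runtime.so
```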
@MasterJH5574 Thank you very much! It works perfectly!
🐛 Bug
When building tvm-unity from source on my platform, it fails to produce libtvm_runtime. The errors look something like this:
To Reproduce
I just followed the documentation, and my settings are:
Expected behavior
I expect the build to successfully produce libtvm_runtime and libtvm, so that I can install tvm-unity in my virtual environment.
Environment
- How you installed MLC-LLM (conda, source): not installed yet
- How you installed TVM-Unity (pip, source): source
- TVM Unity Hash Tag (`python -c "import tvm; print('\n'.join(f'{k}: {v}' for k, v in tvm.support.libinfo().items()))"`, applicable if you compile models):
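As an aside, the one-liner in the template just pretty-prints the dictionary returned by `tvm.support.libinfo()`. A minimal standalone illustration of the same formatting, using a hypothetical sample of that dictionary (the keys and values below are made up, not taken from a real build):

```python
# Hypothetical sample of what tvm.support.libinfo() might return.
libinfo = {
    "USE_CUDA": "ON",
    "USE_FLASHINFER": "ON",
    "GIT_COMMIT_HASH": "abc123",
}

# Same "key: value" per-line formatting as the template's one-liner.
formatted = "\n".join(f"{k}: {v}" for k, v in libinfo.items())
print(formatted)
```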