Closed Louym closed 4 months ago
Hi @Louym, did you have any luck building mlc-llm and tvm on an Orin? I'm trying to do the same and I'm having a really hard time.
Not yet, @oglok. Indeed, I just want the nsys results for MLC-LLM, and I am now able to use MLC-LLM in Jetson containers. I also tried to pip install tvm, which has prebuilt wheels for the arm64 platform, but it seems MLC-LLM strictly requires the relax version of tvm. You may find more in this issue: https://github.com/dusty-nv/jetson-containers/issues/531
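For context, capturing nsys results for an MLC-LLM run typically looks like the sketch below. `run_mlc.py` is a hypothetical placeholder for the actual entry point, and the trace flags are common Nsight Systems options rather than anything prescribed in this thread:

```shell
# Profile a CUDA workload with Nsight Systems; run_mlc.py is hypothetical.
nsys profile -t cuda,nvtx -o mlc_report python3 run_mlc.py

# Print summary tables (kernel times, memcpy stats, etc.) from the report.
nsys stats mlc_report.nsys-rep
```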
Hey @Louym @oglok, sorry for the late reply. Could you please check out the latest tvm, add the following lines to build/config.cmake, and try building again?

```cmake
set(USE_FLASHINFER ON)
set(FLASHINFER_ENABLE_FP8 OFF)
set(FLASHINFER_ENABLE_BF16 OFF)
set(FLASHINFER_GEN_GROUP_SIZES 1 4 6 8)
set(FLASHINFER_GEN_PAGE_SIZES 16)
set(FLASHINFER_GEN_HEAD_DIMS 128)
set(FLASHINFER_GEN_KV_LAYOUTS 0 1)
set(FLASHINFER_GEN_POS_ENCODING_MODES 0 1)
set(FLASHINFER_GEN_ALLOW_FP16_QK_REDUCTIONS "false")
set(FLASHINFER_GEN_CASUALS "false" "true")
```

There were some changes in the FlashInfer build (where the error you see happens), so I wonder whether the latest tvm commit works.
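For reference, a minimal sketch of where these lines fit in a source build. The paths and commands follow the standard TVM build flow and are assumptions, not steps taken from this thread:

```shell
# Assumed layout: run from the root of a tvm checkout.
mkdir -p build
cp cmake/config.cmake build/

# Append the FLASHINFER set(...) lines shown above to build/config.cmake, then:
cd build
cmake ..
make -j"$(nproc)"   # expected to produce libtvm.so and libtvm_runtime.so
```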
@MasterJH5574 Thank you very much! It works perfectly!
🐛 Bug
When building tvm-unity from source on my platform, it fails to produce libtvm_runtime. The errors look something like this:
To Reproduce
I just followed the documentation, and my settings are:
Expected behavior
I expect the build to successfully produce libtvm_runtime and libtvm, so that I can install tvm-unity in my virtual environment.
Environment
- How you installed MLC-LLM (conda, source): not installed yet
- How you installed TVM-Unity (pip, source): source
- TVM Unity Hash Tag (`python -c "import tvm; print('\n'.join(f'{k}: {v}' for k, v in tvm.support.libinfo().items()))"`, applicable if you compile models):
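As an aside, the one-liner in the template just pretty-prints the dictionary returned by `tvm.support.libinfo()`. A minimal standalone illustration of the same formatting, using a hypothetical sample of that dictionary (the keys and values below are made up, not taken from a real build):

```python
# Hypothetical sample of what tvm.support.libinfo() might return.
libinfo = {
    "USE_CUDA": "ON",
    "USE_FLASHINFER": "ON",
    "GIT_COMMIT_HASH": "abc123",
}

# Same "key: value" per-line formatting as the template's one-liner.
formatted = "\n".join(f"{k}: {v}" for k, v in libinfo.items())
print(formatted)
```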