mlc-ai / mlc-llm

Universal LLM Deployment Engine with ML Compilation
https://llm.mlc.ai/
Apache License 2.0

[Bug] TVM Unity Compiler Build Errors on Orange Pi 5 Max #2993

Closed · limcheekin closed this issue 3 weeks ago

limcheekin commented 4 weeks ago

🐛 Bug

I tried to build the TVM Unity compiler from source by following the instructions at https://llm.mlc.ai/docs/install/tvm.html#option-2-build-from-source.

There are no errors when I build only the TVM runtime using the command cmake .. && cmake --build . --target runtime --parallel $(nproc).

But there are some errors when building the full compiler with the command cmake .. && cmake --build . --parallel $(nproc).

The following is the build log:

$ cmake .. && cmake --build . --parallel $(nproc)
-- Hide private symbols...
-- Forbidding undefined symbols in shared library, using -Wl,--no-undefined on platform Linux
-- Build with RPC support...
-- Build with Graph Executor support...
-- Build with profiler...
-- Build with AOT Executor support...
-- Could NOT find GTest (missing: GTEST_LIBRARY GTEST_INCLUDE_DIR GTEST_MAIN_LIBRARY) 
-- Build Alloc alignment set to 64
-- Didn't find the path to CCACHE, disabling ccache
-- VTA build with VTA_HW_PATH=/mnt/emmc/ws/mlc.ai/tvm-unity/3rdparty/vta-hw
-- Build VTA runtime with target: sim
-- Enabled runtime search for OpenCL library location
-- Couldn't build OpenCL-Gtests
-- Use llvm-config=llvm-config --ignore-libllvm --link-static
-- LLVM libdir: /mnt/emmc/ws/mlc.ai/tvm-unity/.conda/lib
-- LLVM cmakedir: /mnt/emmc/ws/mlc.ai/tvm-unity/.conda/lib/cmake/llvm
-- LLVM linker flag: -lrt
-- LLVM linker flag: -ldl
-- LLVM linker flag: -lpthread
-- LLVM links against math
-- LLVM links against zlib
-- LLVM links against static zstd
-- LLVM links against xml2
-- Found LLVM_INCLUDE_DIRS=/mnt/emmc/ws/mlc.ai/tvm-unity/.conda/include
-- Found LLVM_DEFINITIONS=-D_GNU_SOURCE;-D__STDC_CONSTANT_MACROS;-D__STDC_FORMAT_MACROS;-D__STDC_LIMIT_MACROS
-- Found LLVM_LIBS=(Truncated by me...)
-- Found TVM_LLVM_VERSION=191
-- Found TVM_LLVM_HAS_AARCH64_TARGET=1
-- Build with LLVM 
-- Set TVM_LLVM_VERSION=191
-- Build with contrib.random
-- Build with contrib.sort
-- Build with contrib.hybriddump
-- Git found: /usr/bin/git
-- Found TVM_GIT_COMMIT_HASH=dc87019cb805d0a1f0075f6415cc979ef337ec2a
-- Found TVM_GIT_COMMIT_TIME=2024-09-28 00:31:12 -0400
-- Could NOT find LIBBACKTRACE (missing: LIBBACKTRACE_STATIC_LIBRARY LIBBACKTRACE_INCLUDE_DIR) 
-- Building libbacktrace from 3rdparty/libbacktrace
-- Building with TVM Map...
-- Build with thread support...
-- Build without FlashInfer
-- Configuring done (0.7s)
-- Generating done (0.2s)
-- Build files have been written to: /mnt/emmc/ws/mlc.ai/tvm-unity/build
...
[ 15%] Building CXX object CMakeFiles/tvm_objs.dir/src/ir/diagnostic.cc.o
In file included from /mnt/emmc/ws/mlc.ai/tvm-unity/src/autotvm/touch_extractor.cc:25:
In constructor ‘tvm::autotvm::ItervarFeature::ItervarFeature(tvm::autotvm::ItervarFeature&&)’,
    inlined from ‘constexpr std::pair<_T1, _T2>::pair(_U1&&, _U2&&) [with _U1 = tvm::tir::Var&; _U2 = tvm::autotvm::ItervarFeature; typename std::enable_if<(std::_PCC<true, _T1, _T2>::_MoveConstructiblePair<_U1, _U2>() && std::_PCC<true, _T1, _T2>::_ImplicitlyMoveConvertiblePair<_U1, _U2>()), bool>::type <anonymous> = true; _T1 = const tvm::tir::Var; _T2 = tvm::autotvm::ItervarFeature]’ at /usr/include/c++/13/bits/stl_pair.h:688:35,
    inlined from ‘virtual bool tvm::autotvm::TouchExtractor::EnterItervar_(tvm::tir::Var, int64_t, tvm::autotvm::AnnotationType)’ at /mnt/emmc/ws/mlc.ai/tvm-unity/src/autotvm/touch_extractor.cc:103:23:
/mnt/emmc/ws/mlc.ai/tvm-unity/src/autotvm/touch_extractor.h:58:8: warning: ‘<unnamed>.tvm::autotvm::ItervarFeature::bottomup_product’ may be used uninitialized [-Wmaybe-uninitialized]
   58 | struct ItervarFeature {
      |        ^~~~~~~~~~~~~~
/mnt/emmc/ws/mlc.ai/tvm-unity/src/autotvm/touch_extractor.cc: In member function ‘virtual bool tvm::autotvm::TouchExtractor::EnterItervar_(tvm::tir::Var, int64_t, tvm::autotvm::AnnotationType)’:
/mnt/emmc/ws/mlc.ai/tvm-unity/src/autotvm/touch_extractor.cc:105:84: note: ‘<anonymous>’ declared here
  105 |                              topdown_product_, static_cast<int>(itervar_counter_++))});
      |                
...
[ 61%] Building CXX object CMakeFiles/tvm_objs.dir/src/tir/transforms/lower_init_block.cc.o
/mnt/emmc/ws/mlc.ai/tvm-unity/src/tir/transforms/lower_cross_thread_reduction.cc: In function ‘tvm::tir::Stmt tvm::tir::TransformReductionBlock(const BlockRealizeNode*, const tvm::runtime::Optional<tvm::runtime::Array<Buffer> >&, const tvm::runtime::Array<Buffer>&, const tvm::runtime::Array<Buffer>&, const tvm::runtime::Array<tvm::PrimExpr>&, const CommReducer&, const tvm::runtime::Array<tvm::PrimExpr>&, const std::vector<const ForNode*>&)’:
/mnt/emmc/ws/mlc.ai/tvm-unity/src/tir/transforms/lower_cross_thread_reduction.cc:351:24: warning: moving ‘new_block.tvm::runtime::ObjectPtr<tvm::tir::BlockNode>::operator->()->tvm::tir::BlockNode::reads’ of type ‘tvm::runtime::Array<tvm::tir::BufferRegion>’ to itself [-Wself-move]
  351 |       new_block->reads = std::move(new_block->reads);
      |       ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/mnt/emmc/ws/mlc.ai/tvm-unity/src/tir/transforms/lower_cross_thread_reduction.cc:351:24: note: remove ‘std::move’ call

The build log is long, and it is hard to copy from the console screen. Let me know if you need more information or the complete build log; I will re-run the build and save the output to a log file.
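In case it helps, this is how I could capture the full log on the next run (a sketch, assuming bash; the log file names are arbitrary):

```shell
# Re-run configure and build, mirroring all output (stdout and stderr)
# to the console while also saving it to log files via tee:
cmake .. 2>&1 | tee configure.log
cmake --build . --parallel "$(nproc)" 2>&1 | tee build.log
```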

Thanks in advance.

Expected behavior

It should build successfully without errors.

Environment

nihalgeorge01 commented 4 weeks ago

Investigating, thanks for the report! A possibly related issue was found in https://github.com/mlc-ai/relax/issues/325 and downgrading LLVM to <= 18 seemed to work. The specific error strings were different from the ones you mentioned, but as a short-term workaround, could you try the build process with LLVM <= 18?
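Since your log shows LLVM coming from a conda environment under .conda/lib, something like the following could pin LLVM 18 (a sketch only; the llvmdev package name and conda-forge channel are assumptions, so adjust for your setup):

```shell
# Swap the LLVM 19 toolchain for LLVM 18 in the active conda env
# (package/channel names assumed; verify against conda-forge):
conda install -c conda-forge "llvmdev=18"

# Reconfigure from a clean build directory so CMake re-detects LLVM:
rm -rf build && mkdir build && cd build
cmake .. && cmake --build . --parallel "$(nproc)"
```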

limcheekin commented 3 weeks ago

@nihalgeorge01 Thanks for the quick response, it works! :)

Even though there were still some errors on my console during the build with LLVM <= 18, I managed to run mlc_llm chat HF://mlc-ai/Llama-3.2-1B-Instruct-q4f16_0-MLC after the build and chat with it on my console.

I have a newbie question: how do I know which of the models hosted at https://huggingface.co/mlc-ai are compatible with and runnable on the Orange Pi 5 Max? I tried HF://mlc-ai/Llama-3-8B-Instruct-fp8-MLC; it seemed to load successfully, but the response text is mostly exclamation marks (!).

Given the following models:

My understanding is that q4f32 will generate better responses than q4f16, and that q4 is better than q0. Is that correct? Also, what do the _0 in q4f16_0 and the _1 in q4f32_1 stand for in the models above?

Please advise. Thank you.