mlc-ai / relax

[Bug] ACL Core Library no longer a part of Arm Compute Library -> Failure to Compile TVM's ACL Contrib. Lib. #321

Open BuildBackBuehler opened 1 month ago

BuildBackBuehler commented 1 month ago

What Happened

When I attempt to build TVM-Unity, I run into errors with the ACL library. I believe (or hope) it is because this change is not reflected in the TVM ACL CMake config.

The 2 errors that trip up the build are...

All the files are in /src/runtime/contrib/arm_compute_lib/ of course.

Edit: This affects not only the ACL contrib. compilation but also the Rust build:

gmake[2]: *** [CMakeFiles/tvm_runtime_objs.dir/build.make:1098: CMakeFiles/tvm_runtime_objs.dir/src/runtime/contrib/arm_compute_lib/acl_runtime.cc.o] Error 1
error: failed to run custom build command for `tvm-sys v0.1.1-alpha (/Users/zack/.home/gitrepos/LLMLife/backend/tvm-unity/rust/tvm-sys)`

Environment

macOS 14.4 on Apple Silicon (Metal/MPS, aarch64), whatever you want to call it. Because Relax doesn't have a TVM-Unity version number, I don't have that readily accessible, but it is the latest 😅. CMake 3.29.2, LLVM 18.1.6 (Clang), -std=c++17.

Steps to reproduce

//Path to a library.
EXTERN_ACL_COMPUTE_CORE_LIB:FILEPATH=OFF

//Path to a library.
EXTERN_ACL_COMPUTE_GRAPH_LIB:FILEPATH=/Users/zack/.home/gitrepos/LLMLife/backend/Misc/ComputeLibrary-24.04/arm_compute/build/libarm_compute_graph.dylib

//Path to a library.
EXTERN_ACL_COMPUTE_LIB:FILEPATH=/Users/zack/.home/gitrepos/LLMLife/backend/Misc/ComputeLibrary-24.04/arm_compute/build/libarm_compute.dylib

There is still a core lib; it just isn't associated with a .dylib, so one cannot provide the core lib directory as the file path.
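
A minimal CMake sketch of how the library lookup could tolerate a missing core library, assuming the ACL build directory from the cache entries above. `ACL_BUILD_DIR` is a hypothetical variable name, and both the use of `TVM_RUNTIME_LINKER_LIBS` as the list of runtime link libraries and the idea that TVM's ACL module could be restructured this way are assumptions, not the project's confirmed setup.

```cmake
# Hypothetical variable: directory that holds the ACL .dylib files.
set(ACL_BUILD_DIR "/Users/zack/.home/gitrepos/LLMLife/backend/Misc/ComputeLibrary-24.04/arm_compute/build")

find_library(EXTERN_ACL_COMPUTE_LIB NAMES arm_compute HINTS "${ACL_BUILD_DIR}")
find_library(EXTERN_ACL_COMPUTE_GRAPH_LIB NAMES arm_compute_graph HINTS "${ACL_BUILD_DIR}")
# May be absent if the ACL release no longer ships a separate core library.
find_library(EXTERN_ACL_COMPUTE_CORE_LIB NAMES arm_compute_core HINTS "${ACL_BUILD_DIR}")

list(APPEND TVM_RUNTIME_LINKER_LIBS ${EXTERN_ACL_COMPUTE_LIB} ${EXTERN_ACL_COMPUTE_GRAPH_LIB})

# Only link the core library when a file was actually found;
# a value of OFF or <VAR>-NOTFOUND evaluates to false here.
if(EXTERN_ACL_COMPUTE_CORE_LIB)
  list(APPEND TVM_RUNTIME_LINKER_LIBS ${EXTERN_ACL_COMPUTE_CORE_LIB})
else()
  message(STATUS "ACL core library not found; linking arm_compute and arm_compute_graph only")
endif()
```

The point of the guard is that an unresolved core library is skipped rather than passed to the linker as a bogus path.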

Notes

Besides the aforementioned issue, the other elephant in the room is that I am doing this on a Mac, which leads to a shaky from-source Arm Compute Library build that lacks the full set of features. A few folks have been asking the ACL maintainers to provide prebuilt Mac packages (I mention this because I'm sure that would at least ensure an error-free compilation with no loose ends).

I doubt these errors are related and I imagine this is something that routinely causes issues in packages but just to be thorough (and maybe get help 😂):

 --- stderr
  error: header '/Users/zack/.home/gitrepos/LLMLife/backend/tvm/include/tvm/runtime/c_backend_api.h' does not exist.
  Error: bindgen failed to generate the Rust bindings for the C API
gmake[2]: *** [CMakeFiles/rust_ext.dir/build.make:73: /Users/zack/.home/gitrepos/LLMLife/backend/tvm-unity/rust/target/release/libcompiler_ext.so] Error 101
gmake[1]: *** [CMakeFiles/Makefile2:152: CMakeFiles/rust_ext.dir/all] Error 2
gmake[1]: *** Waiting for unfinished jobs....
/Users/zack/.home/gitrepos/LLMLife/backend/tvm-unity/src/runtime/contrib/mps/gemm.mm:48:53: error: no member named 'GetCommandQueue' in 'tvm::runtime::metal::MetalWorkspace'
   48 |   id<MTLCommandQueue> queue = entry_ptr->metal_api->GetCommandQueue(A->device);
      |                               ~~~~~~~~~~~~~~~~~~~~  ^
1 error generated.
gmake[2]: *** [CMakeFiles/tvm_runtime_objs.dir/build.make:1000: CMakeFiles/tvm_runtime_objs.dir/src/runtime/contrib/mps/gemm.mm.o] Error 1
/Users/zack/.home/gitrepos/LLMLife/backend/tvm-unity/src/runtime/contrib/mps/conv.mm:36:25: error: 'CopyDataFromTo' is a protected member of 'tvm::runtime::metal::MetalWorkspace'
   36 |   entry_ptr->metal_api->CopyDataFromTo((__bridge void*)mtlbuf, 0, (__bridge void*)temp, 0,
      |                         ^
/Users/zack/.home/gitrepos/LLMLife/backend/tvm-unity/src/runtime/contrib/mps/../../metal/metal_common.h:187:8: note: declared protected here
  187 |   void CopyDataFromTo(const void* from, size_t from_size, void* to, size_t to_size, size_t size,
      |        ^
/Users/zack/.home/gitrepos/LLMLife/backend/tvm-unity/src/runtime/contrib/mps/conv.mm:72:25: error: 'CopyDataFromTo' is a protected member of 'tvm::runtime::metal::MetalWorkspace'
   72 |   entry_ptr->metal_api->CopyDataFromTo((__bridge void*)temp, 0, (__bridge void*)mtlbuf, 0,
      |                         ^
/Users/zack/.home/gitrepos/LLMLife/backend/tvm-unity/src/runtime/contrib/mps/../../metal/metal_common.h:187:8: note: declared protected here
  187 |   void CopyDataFromTo(const void* from, size_t from_size, void* to, size_t to_size, size_t size,
      |        ^
/Users/zack/.home/gitrepos/LLMLife/backend/tvm-unity/src/runtime/contrib/mps/conv.mm:106:53: error: no member named 'GetCommandQueue' in 'tvm::runtime::metal::MetalWorkspace'
  106 |   id<MTLCommandQueue> queue = entry_ptr->metal_api->GetCommandQueue(data->device);
      |                               ~~~~~~~~~~~~~~~~~~~~  ^
/Users/zack/.home/gitrepos/LLMLife/backend/tvm-unity/src/runtime/contrib/mps/conv.mm:115:25: error: 'CopyDataFromTo' is a protected member of 'tvm::runtime::metal::MetalWorkspace'
  115 |   entry_ptr->metal_api->CopyDataFromTo((__bridge void*)bufB, 0, (__bridge void*)tempB, 0,
      |                         ^
/Users/zack/.home/gitrepos/LLMLife/backend/tvm-unity/src/runtime/contrib/mps/../../metal/metal_common.h:187:8: note: declared protected here
  187 |   void CopyDataFromTo(const void* from, size_t from_size, void* to, size_t to_size, size_t size,
      |        ^
4 errors generated.

Going to look into this one online as mentioned, so I imagine I'll be able to clear it up.

Solution

Obviously I don't readily have one. I figure my best option is to use CMake functions to link my IAllocator.h and Types.h to the corresponding TVM contrib. files.

I tried this but I'm probably doing something wrong. It doesn't help that 1 or 2 of the problematic headers are used in multiple files AFAIK/AFAIR.

find_package(/src/runtime/contrib/arm_compute_lib/acl_utils COMPONENTS arm_compute/core/Types)
target_link_libraries(/Users/zack/.home/gitrepos/LLMLife/backend/Misc/ComputeLibrary-24.04/arm_compute/core/Types PRIVATE acl_utils::arm_compute/core/Types)

find_package(/src/runtime/contrib/arm_compute_lib/acl_allocator COMPONENTS arm_compute/runtime/IAllocator)
target_link_libraries(/Users/zack/.home/gitrepos/LLMLife/backend/Misc/ComputeLibrary-24.04/arm_compute/runtime/IAllocator PRIVATE acl_allocator::arm_compute/runtime/IAllocator)
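
For context on why the attempt above does not work: `find_package()` expects a package name rather than a source path, and `target_link_libraries()` expects an existing CMake target as its first argument. Headers are normally exposed through include paths instead. A hedged sketch, assuming the TVM sources include the headers as `arm_compute/core/Types.h` and `arm_compute/runtime/IAllocator.h`; `ACL_ROOT` and `ACL_INCLUDE_DIR` are hypothetical names introduced for illustration:

```cmake
# Hypothetical variable: root of the ACL tree (the directory that contains arm_compute/).
set(ACL_ROOT "/Users/zack/.home/gitrepos/LLMLife/backend/Misc/ComputeLibrary-24.04")

# Find the directory that makes "arm_compute/core/Types.h" resolvable as an include.
find_path(ACL_INCLUDE_DIR
  NAMES arm_compute/core/Types.h
  HINTS "${ACL_ROOT}" "${ACL_ROOT}/include")

if(NOT ACL_INCLUDE_DIR)
  message(FATAL_ERROR "arm_compute/core/Types.h not found under ${ACL_ROOT}")
endif()

# Headers are made visible via include paths, not via target_link_libraries().
include_directories(SYSTEM "${ACL_INCLUDE_DIR}")
```

Because the include path applies to every source that uses these headers, it does not matter that one or two of them are pulled in by multiple files.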

BuildBackBuehler commented 1 month ago

Buehler?

Turns out this issue runs deeper than I thought, though at the same time it is relatively straightforward: it seems entirely tied up in integrating the ACL library and its graph library. ACL is developed by Arm and has traditionally targeted Linux and Android devices rather than Apple's M-series, so while it does compile on macOS, it isn't pretty. Without a more experienced user's intervention, I guess the data traded between ACL and one's computer does not transfer over to TVM. I'd imagine that if I exported a profile from ACL for LLVM, my compilation wouldn't get tripped up. Right now I'm getting multiple linker errors regarding the targets of the compilation; codegen_cpu, codegen_amdgpu, and codegen_aarch64 are some I see.

But I suppose the code will speak better than I can.

ld64.lld: error: undefined symbol: llvm::DisableABIBreakingChecks
>>> referenced by /Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/target/parsers/aprofile.cc
>>>               CMakeFiles/tvm_objs.dir/src/target/parsers/aprofile.cc.o:(symbol ltmp6+0x0)

ld64.lld: error: undefined symbol: llvm::Value::getName() const
>>> referenced by codegen_aarch64.cc:100 (/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/target/llvm/codegen_aarch64.cc:100)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_aarch64.cc.o:(symbol tvm::codegen::CodeGenAArch64::VisitStmt_(tvm::tir::AttrStmtNode const*)+0x87c)
>>> referenced by codegen_aarch64.cc:94 (/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/target/llvm/codegen_aarch64.cc:94)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_aarch64.cc.o:(symbol tvm::codegen::CodeGenAArch64::VisitStmt_(tvm::tir::AttrStmtNode const*)+0x7d4)
>>> referenced by codegen_llvm.cc:2207 (/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/target/llvm/codegen_llvm.cc:2207)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_llvm.cc.o:(symbol tvm::codegen::CodeGenLLVM::AddDebugInformation(llvm::Function*, tvm::runtime::Array<tvm::Type, void> const&)+0x238)
>>> referenced 1 more times

ld64.lld: error: undefined symbol: llvm::Function::addFnAttr(llvm::StringRef, llvm::StringRef)
>>> referenced by codegen_aarch64.cc:101 (/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/target/llvm/codegen_aarch64.cc:101)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_aarch64.cc.o:(symbol tvm::codegen::CodeGenAArch64::VisitStmt_(tvm::tir::AttrStmtNode const*)+0x5b4)
>>> referenced by codegen_aarch64.cc:95 (/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/target/llvm/codegen_aarch64.cc:95)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_aarch64.cc.o:(symbol tvm::codegen::CodeGenAArch64::VisitStmt_(tvm::tir::AttrStmtNode const*)+0x4e8)
>>> referenced by codegen_amdgpu.cc:96 (/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/target/llvm/codegen_amdgpu.cc:96)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_amdgpu.cc.o:(symbol tvm::codegen::CodeGenAMDGPU::AddFunction(tvm::GlobalVar const&, tvm::tir::PrimFunc const&)+0x228)
>>> referenced 2 more times

ld64.lld: error: undefined symbol: llvm::Function::addFnAttr(llvm::Attribute)
>>> referenced by codegen_aarch64.cc:62 (/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/target/llvm/codegen_aarch64.cc:62)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_aarch64.cc.o:(symbol tvm::codegen::CodeGenAArch64::SetTargetAttributes(llvm::Function*)+0x108)

ld64.lld: error: undefined symbol: llvm::Attribute::getWithVScaleRangeArgs(llvm::LLVMContext&, unsigned int, unsigned int)
>>> referenced by codegen_aarch64.cc:63 (/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/target/llvm/codegen_aarch64.cc:63)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_aarch64.cc.o:(symbol tvm::codegen::CodeGenAArch64::SetTargetAttributes(llvm::Function*)+0xfc)

ld64.lld: error: undefined symbol: llvm::DataLayout::~DataLayout()
>>> referenced by codegen_amdgpu.cc:275 (/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/target/llvm/codegen_amdgpu.cc:275)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_amdgpu.cc.o:(symbol tvm::codegen::BuildAMDGPU(tvm::IRModule, tvm::Target)+0x10e4)
>>> referenced by codegen_amdgpu.cc:275 (/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/target/llvm/codegen_amdgpu.cc:275)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_amdgpu.cc.o:(symbol tvm::codegen::BuildAMDGPU(tvm::IRModule, tvm::Target)+0x3e0)
>>> referenced by codegen_blob.cc:74 (/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/target/llvm/codegen_blob.cc:74)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_blob.cc.o:(symbol tvm::codegen::CodeGenBlob(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, bool, tvm::codegen::LLVMTarget*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&)+0xd14)
>>> referenced 25 more times

ld64.lld: error: undefined symbol: llvm::raw_ostream::~raw_ostream()
>>> referenced by object.h:0 (/Users/zack/.home/gitrepos/LLMLife/backend/tvm/include/tvm/runtime/object.h:0)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_amdgpu.cc.o:(symbol tvm::codegen::BuildAMDGPU(tvm::IRModule, tvm::Target)+0x1074)
>>> referenced by object.h:0 (/Users/zack/.home/gitrepos/LLMLife/backend/tvm/include/tvm/runtime/object.h:0)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_amdgpu.cc.o:(symbol tvm::codegen::BuildAMDGPU(tvm::IRModule, tvm::Target)+0x106c)
>>> referenced by object.h:0 (/Users/zack/.home/gitrepos/LLMLife/backend/tvm/include/tvm/runtime/object.h:0)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_amdgpu.cc.o:(symbol tvm::codegen::BuildAMDGPU(tvm::IRModule, tvm::Target)+0x1064)
>>> referenced 24 more times

ld64.lld: error: undefined symbol: llvm::legacy::PassManager::~PassManager()
>>> referenced by codegen_amdgpu.cc:344 (/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/target/llvm/codegen_amdgpu.cc:344)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_amdgpu.cc.o:(symbol tvm::codegen::BuildAMDGPU(tvm::IRModule, tvm::Target)+0x1030)
>>> referenced by codegen_amdgpu.cc:344 (/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/target/llvm/codegen_amdgpu.cc:344)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_amdgpu.cc.o:(symbol tvm::codegen::BuildAMDGPU(tvm::IRModule, tvm::Target)+0x1010)
>>> referenced by codegen_amdgpu.cc:344 (/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/target/llvm/codegen_amdgpu.cc:344)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_amdgpu.cc.o:(symbol tvm::codegen::BuildAMDGPU(tvm::IRModule, tvm::Target)+0xb30)
>>> referenced 9 more times

ld64.lld: error: undefined symbol: llvm::Module::~Module()
>>> referenced by unique_ptr.h:66 (/opt/homebrew/opt/llvm/bin/../include/c++/v1/__memory/unique_ptr.h:66)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_amdgpu.cc.o:(symbol tvm::codegen::BuildAMDGPU(tvm::IRModule, tvm::Target)+0xbb0)
>>> referenced by unique_ptr.h:66 (/opt/homebrew/opt/llvm/bin/../include/c++/v1/__memory/unique_ptr.h:66)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_amdgpu.cc.o:(symbol tvm::codegen::BuildAMDGPU(tvm::IRModule, tvm::Target)+0xb54)
>>> referenced by unique_ptr.h:66 (/opt/homebrew/opt/llvm/bin/../include/c++/v1/__memory/unique_ptr.h:66)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_amdgpu.cc.o:(symbol tvm::codegen::BuildAMDGPU(tvm::IRModule, tvm::Target)+0xb40)
>>> referenced 41 more times

ld64.lld: error: undefined symbol: llvm::legacy::PassManager::run(llvm::Module&)
>>> referenced by codegen_amdgpu.cc:331 (/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/target/llvm/codegen_amdgpu.cc:331)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_amdgpu.cc.o:(symbol tvm::codegen::BuildAMDGPU(tvm::IRModule, tvm::Target)+0x778)
>>> referenced by codegen_amdgpu.cc:312 (/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/target/llvm/codegen_amdgpu.cc:312)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_amdgpu.cc.o:(symbol tvm::codegen::BuildAMDGPU(tvm::IRModule, tvm::Target)+0x668)
>>> referenced by codegen_hexagon.cc:611 (/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/target/llvm/codegen_hexagon.cc:611)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_hexagon.cc.o:(symbol tvm::codegen::BuildHexagon(tvm::IRModule, tvm::Target)::$_1::operator()(llvm::Module const&, tvm::codegen::BuildHexagon(tvm::IRModule, tvm::Target)::CodeGenFileType) const+0x170)
>>> referenced 3 more times

ld64.lld: error: undefined symbol: llvm::legacy::PassManager::PassManager()
>>> referenced by codegen_amdgpu.cc:315 (/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/target/llvm/codegen_amdgpu.cc:315)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_amdgpu.cc.o:(symbol tvm::codegen::BuildAMDGPU(tvm::IRModule, tvm::Target)+0x740)
>>> referenced by codegen_amdgpu.cc:297 (/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/target/llvm/codegen_amdgpu.cc:297)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_amdgpu.cc.o:(symbol tvm::codegen::BuildAMDGPU(tvm::IRModule, tvm::Target)+0x630)
>>> referenced by codegen_hexagon.cc:608 (/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/target/llvm/codegen_hexagon.cc:608)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_hexagon.cc.o:(symbol tvm::codegen::BuildHexagon(tvm::IRModule, tvm::Target)::$_1::operator()(llvm::Module const&, tvm::codegen::BuildHexagon(tvm::IRModule, tvm::Target)::CodeGenFileType) const+0x130)
>>> referenced 3 more times

ld64.lld: error: undefined symbol: llvm::CloneModule(llvm::Module const&)
>>> referenced by codegen_amdgpu.cc:295 (/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/target/llvm/codegen_amdgpu.cc:295)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_amdgpu.cc.o:(symbol tvm::codegen::BuildAMDGPU(tvm::IRModule, tvm::Target)+0x628)
>>> referenced by codegen_amdgpu.cc:294 (/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/target/llvm/codegen_amdgpu.cc:294)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_amdgpu.cc.o:(symbol tvm::codegen::BuildAMDGPU(tvm::IRModule, tvm::Target)+0x61c)
>>> referenced by codegen_hexagon.cc:607 (/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/target/llvm/codegen_hexagon.cc:607)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_hexagon.cc.o:(symbol tvm::codegen::BuildHexagon(tvm::IRModule, tvm::Target)::$_1::operator()(llvm::Module const&, tvm::codegen::BuildHexagon(tvm::IRModule, tvm::Target)::CodeGenFileType) const+0x128)
>>> referenced 3 more times

ld64.lld: error: undefined symbol: llvm::Module::print(llvm::raw_ostream&, llvm::AssemblyAnnotationWriter*, bool, bool) const
>>> referenced by codegen_amdgpu.cc:289 (/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/target/llvm/codegen_amdgpu.cc:289)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_amdgpu.cc.o:(symbol tvm::codegen::BuildAMDGPU(tvm::IRModule, tvm::Target)+0x610)
>>> referenced by codegen_hexagon.cc:591 (/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/target/llvm/codegen_hexagon.cc:591)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_hexagon.cc.o:(symbol tvm::codegen::BuildHexagon(tvm::IRModule, tvm::Target)+0xa9c)
>>> referenced by codegen_hexagon.cc:591 (/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/target/llvm/codegen_hexagon.cc:591)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_hexagon.cc.o:(symbol tvm::codegen::BuildHexagon(tvm::IRModule, tvm::Target)::$_1::operator()(llvm::Module const&, tvm::codegen::BuildHexagon(tvm::IRModule, tvm::Target)::CodeGenFileType) const+0xb0)
>>> referenced 3 more times

ld64.lld: error: undefined symbol: llvm::raw_ostream::SetBufferAndMode(char*, unsigned long, llvm::raw_ostream::BufferKind)
>>> referenced by raw_ostream.h:174 (/opt/homebrew/opt/llvm/include/llvm/Support/raw_ostream.h:174)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_amdgpu.cc.o:(symbol tvm::codegen::BuildAMDGPU(tvm::IRModule, tvm::Target)+0x5f8)
>>> referenced by raw_ostream.h:174 (/opt/homebrew/opt/llvm/include/llvm/Support/raw_ostream.h:174)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_amdgpu.cc.o:(symbol tvm::codegen::BuildAMDGPU(tvm::IRModule, tvm::Target)+0x5cc)
>>> referenced by raw_ostream.h:174 (/opt/homebrew/opt/llvm/include/llvm/Support/raw_ostream.h:174)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_amdgpu.cc.o:(symbol tvm::codegen::BuildAMDGPU(tvm::IRModule, tvm::Target)+0x5a0)
>>> referenced 15 more times

ld64.lld: error: undefined symbol: llvm::raw_ostream::flush_nonempty()
>>> referenced by raw_ostream.h:187 (/opt/homebrew/opt/llvm/include/llvm/Support/raw_ostream.h:187)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_amdgpu.cc.o:(symbol tvm::codegen::BuildAMDGPU(tvm::IRModule, tvm::Target)+0x5e4)
>>> referenced by raw_ostream.h:187 (/opt/homebrew/opt/llvm/include/llvm/Support/raw_ostream.h:187)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_amdgpu.cc.o:(symbol tvm::codegen::BuildAMDGPU(tvm::IRModule, tvm::Target)+0x5b8)
>>> referenced by raw_ostream.h:187 (/opt/homebrew/opt/llvm/include/llvm/Support/raw_ostream.h:187)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_amdgpu.cc.o:(symbol tvm::codegen::BuildAMDGPU(tvm::IRModule, tvm::Target)+0x58c)
>>> referenced 2 more times

ld64.lld: error: undefined symbol: vtable for llvm::raw_svector_ostream
>>> referenced by raw_ostream.h:690 (/opt/homebrew/opt/llvm/include/llvm/Support/raw_ostream.h:690)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_amdgpu.cc.o:(symbol tvm::codegen::BuildAMDGPU(tvm::IRModule, tvm::Target)+0x4dc)
>>> referenced by raw_ostream.h:690 (/opt/homebrew/opt/llvm/include/llvm/Support/raw_ostream.h:690)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_amdgpu.cc.o:(symbol tvm::codegen::BuildAMDGPU(tvm::IRModule, tvm::Target)+0x4d8)
>>> referenced by raw_ostream.h:690 (/opt/homebrew/opt/llvm/include/llvm/Support/raw_ostream.h:690)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_hexagon.cc.o:(symbol tvm::codegen::BuildHexagon(tvm::IRModule, tvm::Target)::$_1::operator()(llvm::Module const&, tvm::codegen::BuildHexagon(tvm::IRModule, tvm::Target)::CodeGenFileType) const+0xfc)
>>> referenced 5 more times

ld64.lld: error: undefined symbol: llvm::Function::addFnAttr(llvm::Attribute::AttrKind)
>>> referenced by codegen_amdgpu.cc:278 (/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/target/llvm/codegen_amdgpu.cc:278)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_amdgpu.cc.o:(symbol tvm::codegen::BuildAMDGPU(tvm::IRModule, tvm::Target)+0x408)
>>> referenced by codegen_cpu.cc:548 (/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/target/llvm/codegen_cpu.cc:548)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_cpu.cc.o:(symbol tvm::codegen::CodeGenCPU::CreateComputeScope(tvm::tir::AttrStmtNode const*)+0x84c)
>>> referenced by codegen_llvm.cc:391 (/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/target/llvm/codegen_llvm.cc:391)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_llvm.cc.o:(symbol tvm::codegen::CodeGenLLVM::HandleImport(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&)+0x1fc)

ld64.lld: error: undefined symbol: llvm::Module::setDataLayout(llvm::DataLayout const&)
>>> referenced by codegen_amdgpu.cc:275 (/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/target/llvm/codegen_amdgpu.cc:275)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_amdgpu.cc.o:(symbol tvm::codegen::BuildAMDGPU(tvm::IRModule, tvm::Target)+0x3d8)
>>> referenced by codegen_blob.cc:74 (/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/target/llvm/codegen_blob.cc:74)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_blob.cc.o:(symbol tvm::codegen::CodeGenBlob(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, bool, tvm::codegen::LLVMTarget*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&)+0x214)
>>> referenced by codegen_llvm.cc:171 (/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/target/llvm/codegen_llvm.cc:171)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_llvm.cc.o:(symbol tvm::codegen::CodeGenLLVM::InitTarget()+0x108)
>>> referenced 3 more times

ld64.lld: error: undefined symbol: llvm::Type::getPointerTo(unsigned int) const
>>> referenced by codegen_amdgpu.cc:149 (/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/target/llvm/codegen_amdgpu.cc:149)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_amdgpu.cc.o:(symbol tvm::codegen::CodeGenAMDGPU::VisitStmt_(tvm::tir::AllocateNode const*)+0x3a4)
>>> referenced by codegen_blob.cc:109 (/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/target/llvm/codegen_blob.cc:109)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_blob.cc.o:(symbol tvm::codegen::CodeGenBlob(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, bool, tvm::codegen::LLVMTarget*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&)+0x3e8)
>>> referenced by codegen_cpu.cc:163 (/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/target/llvm/codegen_cpu.cc:163)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_cpu.cc.o:(symbol tvm::codegen::CodeGenCPU::Init(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, tvm::codegen::LLVMTarget*, tvm::runtime::Optional<tvm::runtime::String>, bool, bool)+0x508)
>>> referenced 59 more times

ld64.lld: error: undefined symbol: llvm::ConstantInt::get(llvm::Type*, unsigned long long, bool)
>>> referenced by Constants.h:119 (/opt/homebrew/opt/llvm/include/llvm/IR/Constants.h:119)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_amdgpu.cc.o:(symbol tvm::codegen::CodeGenAMDGPU::VisitStmt_(tvm::tir::AllocateNode const*)+0x2f8)
>>> referenced by Constants.h:119 (/opt/homebrew/opt/llvm/include/llvm/IR/Constants.h:119)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_cpu.cc.o:(symbol tvm::codegen::CodeGenCPU::VisitStmt_(tvm::tir::ForNode const*)+0x86c)
>>> referenced by Constants.h:119 (/opt/homebrew/opt/llvm/include/llvm/IR/Constants.h:119)
>>>               CMakeFiles/tvm_objs.dir/src/target/llvm/codegen_cpu.cc.o:(symbol tvm::codegen::CodeGenCPU::VisitStmt_(tvm::tir::AssertStmtNode const*)+0x294)
>>> referenced 74 more times

ld64.lld: error: too many errors emitted, stopping now (use --error-limit=0 to see all errors)

The first error, and maybe the origin, is in aprofile.cc, TVM's target parser for Arm A-profile (Cortex-A) CPUs, which I assume comes into play because of the ACL use.

But like I said, I'm noob-ish, so with a better CMake setup from ACL into TVM (and then MLC), I imagine one can thread the needle without making concessions. I will probably just give up on ACL; TVM/MLC has compiled with little to no problems in the past.

Though I learned a lot and can't help myself, so I may give 'er 1 last go: a clean install from step 000, and now with a toolchain file heh
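
The undefined `llvm::` symbols in the linker output above suggest the LLVM libraries never reached the link line, so one thing worth ruling out is how TVM locates LLVM. A hedged `config.cmake` sketch, assuming Homebrew's `llvm-config` sits next to the headers seen in the log (`/opt/homebrew/opt/llvm`); that path, and whether this addresses the errors at all, are assumptions rather than a verified fix:

```cmake
# Assumed location: the log shows LLVM headers under /opt/homebrew/opt/llvm/include.
# An explicit llvm-config lets TVM query the LLVM install that matches those headers;
# --link-static requests the static LLVM libraries instead of the shared libLLVM.
set(USE_LLVM "/opt/homebrew/opt/llvm/bin/llvm-config --link-static")

# Often paired with static LLVM linking to avoid symbol clashes with other LLVM users.
set(HIDE_PRIVATE_SYMBOLS ON)
```

Re-running CMake from an empty build directory after changing these values avoids stale cache entries from the earlier attempts.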

BuildBackBuehler commented 4 weeks ago

So I did try again. This time I nixed ACL and still ended up with these foundational errors. I can't imagine what I'd be doing each time that CMake would not like (user error).

[100%] Building CXX object CMakeFiles/tvm_runtime_objs.dir/src/runtime/contrib/coreml/coreml_runtime.mm.o
/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/runtime/contrib/coreml/coreml_runtime.mm:102:7: warning: designated initializers are a C++20 extension [-Wc++20-designator]
  102 |       .device_type = kDLCPU,
      |       ^
/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/runtime/contrib/coreml/coreml_runtime.mm:81:25: warning: comparison of integers of different signs: 'int64_t' (aka 'long long') and 'NSUInteger' (aka 'unsigned long') [-Wsign-compare]
   81 |   for (int64_t i = 0; i < data_desc.shape.count; ++i) {
      |                       ~ ^ ~~~~~~~~~~~~~~~~~~~~~
2 warnings generated.
/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/runtime/contrib/mps/gemm.mm:48:53: error: no member named 'GetCommandQueue' in 'tvm::runtime::metal::MetalWorkspace'
   48 |   id<MTLCommandQueue> queue = entry_ptr->metal_api->GetCommandQueue(A->device);
      |                               ~~~~~~~~~~~~~~~~~~~~  ^
1 error generated.
gmake[2]: *** [CMakeFiles/tvm_runtime_objs.dir/build.make:888: CMakeFiles/tvm_runtime_objs.dir/src/runtime/contrib/mps/gemm.mm.o] Error 1
gmake[2]: *** Waiting for unfinished jobs....
/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/runtime/contrib/mps/conv.mm:36:25: error: 'CopyDataFromTo' is a protected member of 'tvm::runtime::metal::MetalWorkspace'
   36 |   entry_ptr->metal_api->CopyDataFromTo((__bridge void*)mtlbuf, 0, (__bridge void*)temp, 0,
      |                         ^
/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/runtime/contrib/mps/../../metal/metal_common.h:187:8: note: declared protected here
  187 |   void CopyDataFromTo(const void* from, size_t from_size, void* to, size_t to_size, size_t size,
      |        ^
/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/runtime/contrib/mps/conv.mm:72:25: error: 'CopyDataFromTo' is a protected member of 'tvm::runtime::metal::MetalWorkspace'
   72 |   entry_ptr->metal_api->CopyDataFromTo((__bridge void*)temp, 0, (__bridge void*)mtlbuf, 0,
      |                         ^
/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/runtime/contrib/mps/../../metal/metal_common.h:187:8: note: declared protected here
  187 |   void CopyDataFromTo(const void* from, size_t from_size, void* to, size_t to_size, size_t size,
      |        ^
/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/runtime/contrib/mps/conv.mm:106:53: error: no member named 'GetCommandQueue' in 'tvm::runtime::metal::MetalWorkspace'
  106 |   id<MTLCommandQueue> queue = entry_ptr->metal_api->GetCommandQueue(data->device);
      |                               ~~~~~~~~~~~~~~~~~~~~  ^
/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/runtime/contrib/mps/conv.mm:115:25: error: 'CopyDataFromTo' is a protected member of 'tvm::runtime::metal::MetalWorkspace'
  115 |   entry_ptr->metal_api->CopyDataFromTo((__bridge void*)bufB, 0, (__bridge void*)tempB, 0,
      |                         ^
/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/runtime/contrib/mps/../../metal/metal_common.h:187:8: note: declared protected here
  187 |   void CopyDataFromTo(const void* from, size_t from_size, void* to, size_t to_size, size_t size,
      |        ^
4 errors generated.

1 thing I'm noticing, which would make sense as the crux of my issues, is that I can't seem to get TVM to build with TVM_THREADPOOL_USE_OPENMP (sic), which would seem pretty critical to TVM's ability to manage Metal/MPS threading, if I follow what it is doing correctly. Usually these errors are preceded by some "threading" warnings:

/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/runtime/threading_backend.cc:296:30: warning: zero as null pointer constant [-Wzero-as-null-pointer-constant]
  296 |     SetThreadFullCpuAffinity(CURRENT_THREAD_HANDLE, mode);
      |                              ^
/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/runtime/threading_backend.cc:51:77: note: expanded from macro 'CURRENT_THREAD_HANDLE'
   51 | #define CURRENT_THREAD_HANDLE (static_cast<std::thread::native_handle_type>(0))
      |                                                                             ^
/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/runtime/threading_backend.cc:441:25: warning: zero as null pointer constant [-Wzero-as-null-pointer-constant]
  441 |       SetThreadAffinity(CURRENT_THREAD_HANDLE,
      |                         ^
/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/runtime/threading_backend.cc:51:77: note: expanded from macro 'CURRENT_THREAD_HANDLE'
   51 | #define CURRENT_THREAD_HANDLE (static_cast<std::thread::native_handle_type>(0))

BuildBackBuehler commented 4 weeks ago

Yeah, it seems the issues arise whenever a module like ACL, MKL, or AppleBLAS relies on that MPS gemm.mm file. Trying to think of anything else I could try: I used sudo and got the same results. I am able to compile without those add-ons.
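
Since every failing file in the log lives under `src/runtime/contrib/mps/`, a `config.cmake` sketch that keeps the Metal runtime but leaves the MPS contrib sources out of the build might look like the following. It assumes `USE_METAL` and `USE_MPS` are the options that control those sources in this TVM checkout, which is worth confirming against the tree's own `cmake/config.cmake`:

```cmake
# Keep the Metal runtime itself (src/runtime/metal).
set(USE_METAL ON)

# Assumed switch for src/runtime/contrib/mps/*.mm (gemm.mm, conv.mm),
# the files that fail on GetCommandQueue and the protected CopyDataFromTo.
set(USE_MPS OFF)
```

This only sidesteps the compile errors; the mismatch between the MPS contrib code and `MetalWorkspace` in `metal_common.h` would still need a source-level fix.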

Edit: Although, it seems I shouldn't have been able to compile whatsoever. I did so without some of the flags I'd normally tack on. While I doubt that'd make a huge difference, perhaps with the cache of errors, the latter builds were "ready" for that issue to come up and just let that error through initially.

Also, not sure if I had included this warning, but it always seems to come right before the error train.

/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/runtime/metal/metal_device_api.mm:52:13: warning: enumeration value 'kAvailableGlobalMemory' not handled in switch [-Wswitch]
   52 |     switch (kind) {
      |             ^~~~