mlc-ai / mlc-llm

Universal LLM Deployment Engine with ML Compilation
https://llm.mlc.ai/
Apache License 2.0
19.14k stars · 1.57k forks

[Bug] llama2 7b android compilation is giving "Can only handle constant size stack allocation for now" error #2282

Closed Ramees025 closed 5 months ago

Ramees025 commented 6 months ago

Model: https://huggingface.co/mlc-ai/Llama-2-7b-chat-hf-q4f16_1-MLC

Compiling the above model with:

```shell
mlc_llm compile ./dist/Llama-2-7b-chat-hf-q4f16_1-MLC/mlc-chat-config.json \
  --device android \
  -o ./dist/Llama-2-7b-chat-hf-q4f16_1-MLC/llama-2-7b-chat-hf-q4f16_1-android.tar
```

gives the following error:

```
  File "/home/test/Ramees/relax/src/target/source/codegen_c.h", line 104, in tvm::codegen::CodeGenC::PrintStmt(tvm::tir::Stmt const&)
    void PrintStmt(const Stmt& n) { VisitStmt(n); }
  File "/home/test/Ramees/relax/src/target/source/codegen_c.cc", line 989, in tvm::codegen::CodeGenC::VisitStmt_(tvm::tir::AllocateNode const*)
    ICHECK_GT(constant_size, 0) << "Can only handle constant size stack allocation for now";
tvm.error.InternalError: Traceback (most recent call last):
  6: operator()
        at /home/test/Ramees/relax/src/driver/driver_api.cc:531
  5: tvm::TIRToRuntime(tvm::runtime::Map<tvm::Target, tvm::IRModule, void, void> const&, tvm::Target const&)
        at /home/test/Ramees/relax/src/driver/driver_api.cc:514
  4: tvm::codegen::Build(tvm::IRModule, tvm::Target)
        at /home/test/Ramees/relax/src/target/codegen.cc:73
  3: tvm::codegen::BuildOpenCL(tvm::IRModule, tvm::Target)
        at /home/test/Ramees/relax/src/target/source/codegen_opencl.cc:619
  2: tvm::codegen::CodeGenC::AddFunction(tvm::GlobalVar const&, tvm::tir::PrimFunc const&)
        at /home/test/Ramees/relax/src/target/source/codegen_c.cc:167
  1: tvm::codegen::CodeGenC::PrintStmt(tvm::tir::Stmt const&)
        at /home/test/Ramees/relax/src/target/source/codegen_c.h:104
  0: tvm::codegen::CodeGenC::VisitStmt_(tvm::tir::AllocateNode const*)
        at /home/test/Ramees/relax/src/target/source/codegen_c.cc:989
  File "/home/test/Ramees/relax/src/target/source/codegen_c.cc", line 989
InternalError: Check failed: constant_size > 0 (0 vs. 0) : Can only handle constant size stack allocation for now
```

Any idea why this is happening? @tqchen

tqchen commented 6 months ago

I am not sure what is happening in this case; perhaps it is related to some stale variant of the compiler. We recently updated our Android SDK: https://llm.mlc.ai/docs/deploy/android.html. Please try following the new instructions.
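In case it helps, a minimal sketch of refreshing to the latest nightly wheels. The package names are the cu122 nightlies reported later in this thread, and the wheel index URL is an assumption based on the MLC docs; verify both against the current install instructions before running:

```shell
# Sketch: upgrade mlc-llm and its matching tvm-unity build to the latest
# nightly wheels (CUDA 12.2 variants). Upgrading both together avoids the
# version-mismatch situation described below in this thread.
pip install --pre -U -f https://mlc.ai/wheels \
    mlc-ai-nightly-cu122 mlc-llm-nightly-cu122
```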

DeclK commented 5 months ago

This is not just happening with the Android SDK; I am seeing the same error on a CUDA device. When I use an mlc_llm built from source, the error occurs, but when I use the pre-built mlc_llm, it goes away. It is hard to locate where the problem is; can anyone give more hints? @tqchen

In my case, I had already installed the pre-built mlc_llm and tvm-unity packages in my environment. I then cloned the newest version of mlc_llm, created a virtual environment, and built it from source. I tested a Qwen2-0.5B model with these two versions of mlc_llm:

```
mlc-ai-nightly-cu122      0.15.dev297
mlc_llm                   0.1.dev1231+gbc6e3edd /xxx/mlc-llm/python
mlc-llm-nightly-cu122     0.1.dev1145
```

**Update:** This is solved when I upgrade my tvm version to the latest pre-built package:

```
mlc-ai-nightly-cu122      0.15.dev364
```

I guess mlc_llm and tvm are closely entangled; there was some mismatch between the new version of mlc_llm and the old version of tvm.
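One quick way to surface this kind of mismatch is to print the installed nightly build numbers before compiling. The sketch below is purely illustrative: the package names are taken from the listings above, and the version-parsing helper assumes the `X.Y.devNNN[+local]` format those listings show.

```python
from importlib.metadata import PackageNotFoundError, version


def dev_build(ver: str) -> int:
    """Extract the trailing devNNN build number from a nightly version string.

    e.g. "0.15.dev297" -> 297, "0.1.dev1231+gbc6e3edd" -> 1231
    """
    tail = ver.rsplit("dev", 1)[-1]      # text after the last "dev"
    return int(tail.split("+", 1)[0])    # drop a local suffix like "+gbc6e3edd"


def report(pkg: str) -> None:
    """Print the installed version of a package, with its parsed build number."""
    try:
        ver = version(pkg)
        print(f"{pkg:25s} {ver}  (nightly build {dev_build(ver)})")
    except PackageNotFoundError:
        print(f"{pkg:25s} not installed")


if __name__ == "__main__":
    # Package names as reported in this thread (cu122 nightlies).
    for pkg in ("mlc-ai-nightly-cu122", "mlc-llm-nightly-cu122"):
        report(pkg)
```

If the tvm-unity build (`mlc-ai-nightly-cu122`) lags well behind the mlc_llm build you are using, upgrading it, as described above, is the first thing to try.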

tqchen commented 5 months ago

Glad this is resolved. It was likely due to an old version of tvm.