mlc-ai / mlc-llm

Universal LLM Deployment Engine with ML Compilation
https://llm.mlc.ai/
Apache License 2.0

[Bug] Segmentation fault while building runtime and model libraries for Android #2922

Open iamlixiao opened 3 weeks ago

iamlixiao commented 3 weeks ago

🐛 Bug

I have been following the Android tutorial in the official docs: https://llm.mlc.ai/docs/deploy/android.html

During Step 2 (Build Runtime and Model Libraries), the mlc_llm package command failed with the following output:

RuntimeError: Cannot find compilation output, compilation failed

I then ran the same compilation command in gdb, and found that the failure was caused by a segmentation fault in libtvm:

GNU gdb (Ubuntu 12.1-0ubuntu1~22.04.2) 12.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /home/lixiao/miniforge3/envs/mlc-llm/bin/python...
(gdb) run -m mlc_llm compile /home/lixiao/.cache/mlc_llm/model_weights/hf/mlc-ai/Phi-3.5-mini-instruct-q4f16_1-MLC --opt 'flashinfer=1;cublas_gemm=1;faster_transformer=0;cudagraph=1;cutlass=1;ipc_allreduce_strategy=NONE' --overrides prefill_chunk_size=128 --device android --output /tmp/tmpq7mp2ini/lib.tar --system-lib-prefix phi3_q4f16_1_022af8ba80c36b0d4dc3cb6b36fdddf4_
Starting program: /home/lixiao/miniforge3/envs/mlc-llm/bin/python -m mlc_llm compile /home/lixiao/.cache/mlc_llm/model_weights/hf/mlc-ai/Phi-3.5-mini-instruct-q4f16_1-MLC --opt 'flashinfer=1;cublas_gemm=1;faster_transformer=0;cudagraph=1;cutlass=1;ipc_allreduce_strategy=NONE' --overrides prefill_chunk_size=128 --device android --output /tmp/tmpq7mp2ini/lib.tar --system-lib-prefix phi3_q4f16_1_022af8ba80c36b0d4dc3cb6b36fdddf4_
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff4bff640 (LWP 30970)]
[New Thread 0x7ffff43fe640 (LWP 30971)]
[New Thread 0x7ffff1bfd640 (LWP 30972)]
[New Thread 0x7fffef3fc640 (LWP 30973)]
[New Thread 0x7fffeabfb640 (LWP 30974)]
[New Thread 0x7fffe83fa640 (LWP 30975)]
[New Thread 0x7fffe5bf9640 (LWP 30976)]
[New Thread 0x7fffe33f8640 (LWP 30977)]
[New Thread 0x7fffe0bf7640 (LWP 30978)]
[New Thread 0x7fffde3f6640 (LWP 30979)]
[New Thread 0x7fffdbbf5640 (LWP 30980)]
[New Thread 0x7fffd93f4640 (LWP 30981)]
[New Thread 0x7fffd6bf3640 (LWP 30982)]
[New Thread 0x7fffd43f2640 (LWP 30983)]
[New Thread 0x7fffd1bf1640 (LWP 30984)]
[New Thread 0x7fffcf3f0640 (LWP 30985)]
[New Thread 0x7fffccbef640 (LWP 30986)]
[New Thread 0x7fffca3ee640 (LWP 30987)]
[New Thread 0x7fffc7bed640 (LWP 30988)]
[New Thread 0x7fffc53ec640 (LWP 30989)]
[New Thread 0x7fffc2beb640 (LWP 30990)]
[New Thread 0x7fffc03ea640 (LWP 30991)]
[New Thread 0x7fffbdbe9640 (LWP 30992)]
[New Thread 0x7fffbb3e8640 (LWP 30993)]
[New Thread 0x7fffb8be7640 (LWP 30994)]
[New Thread 0x7fffb63e6640 (LWP 30995)]
[New Thread 0x7fffb3be5640 (LWP 30996)]
[New Thread 0x7fffb13e4640 (LWP 30997)]
[New Thread 0x7fffaebe3640 (LWP 30998)]
[New Thread 0x7fffac3e2640 (LWP 30999)]
[New Thread 0x7fffa9be1640 (LWP 31000)]
[New Thread 0x7fffa73e0640 (LWP 31001)]
[New Thread 0x7fffa4bdf640 (LWP 31002)]
[New Thread 0x7fffa23de640 (LWP 31003)]
[New Thread 0x7fff9fbdd640 (LWP 31004)]
[2024-09-19 19:32:10] INFO auto_config.py:70: Found model configuration: /home/lixiao/.cache/mlc_llm/model_weights/hf/mlc-ai/Phi-3.5-mini-instruct-q4f16_1-MLC/mlc-chat-config.json
[19:32:10] /workspace/tvm/src/target/parsers/aprofile.cc:97: Warning: Cannot parse target features for target: {"mtriple": "aarch64-linux-android", "kind": "llvm"}. LLVM was not compiled with support for Arm(R)-based targets.
[2024-09-19 19:32:10] INFO auto_config.py:154: Found model type: phi3. Use `--model-type` to override.
Compiling with arguments:
  --config          Phi3Config(model_type='phi3', hidden_size=3072, vocab_size=32064, num_hidden_layers=32, num_attention_heads=32, intermediate_size=8192, rms_norm_eps=1e-05, num_key_value_heads=32, max_position_embeddings=131072, position_embedding_base=10000.0, rope_scaling={'long_factor': [1.0800000429153442, 1.1100000143051147, 1.1399999856948853, 1.340000033378601, 1.5899999141693115, 1.600000023841858, 1.6200000047683716, 2.620000123977661, 3.2300000190734863, 3.2300000190734863, 4.789999961853027, 7.400000095367432, 7.700000286102295, 9.09000015258789, 12.199999809265137, 17.670000076293945, 24.46000099182129, 28.57000160217285, 30.420001983642578, 30.840002059936523, 32.590003967285156, 32.93000411987305, 42.320003509521484, 44.96000289916992, 50.340003967285156, 50.45000457763672, 57.55000305175781, 57.93000411987305, 58.21000289916992, 60.1400032043457, 62.61000442504883, 62.62000274658203, 62.71000289916992, 63.1400032043457, 63.1400032043457, 63.77000427246094, 63.93000411987305, 63.96000289916992, 63.970001220703125, 64.02999877929688, 64.06999969482422, 64.08000183105469, 64.12000274658203, 64.41000366210938, 64.4800033569336, 64.51000213623047, 64.52999877929688, 64.83999633789062], 'short_factor': [1.0, 1.0199999809265137, 1.0299999713897705, 1.0299999713897705, 1.0499999523162842, 1.0499999523162842, 1.0499999523162842, 1.0499999523162842, 1.0499999523162842, 1.0699999332427979, 1.0999999046325684, 1.1099998950958252, 1.1599998474121094, 1.1599998474121094, 1.1699998378753662, 1.2899998426437378, 1.339999794960022, 1.679999828338623, 1.7899998426437378, 1.8199998140335083, 1.8499997854232788, 1.8799997568130493, 1.9099997282028198, 1.9399996995925903, 1.9899996519088745, 2.0199997425079346, 2.0199997425079346, 2.0199997425079346, 2.0199997425079346, 2.0199997425079346, 2.0199997425079346, 2.0299997329711914, 2.0299997329711914, 2.0299997329711914, 2.0299997329711914, 2.0299997329711914, 2.0299997329711914, 2.0299997329711914, 2.0299997329711914, 
2.0299997329711914, 2.0799996852874756, 2.0899996757507324, 2.189999580383301, 2.2199995517730713, 2.5899994373321533, 2.729999542236328, 2.749999523162842, 2.8399994373321533], 'type': 'longrope', 'rope_type': 'longrope', 'max_position_embeddings': 131072, 'original_max_position_embeddings': 4096}, original_max_position_embeddings=4096, context_window_size=131072, prefill_chunk_size=2048, head_dim=96, tensor_parallel_shards=1, max_batch_size=80, kwargs={})
  --quantization    GroupQuantize(name='q4f16_1', kind='group-quant', group_size=32, quantize_dtype='int4', storage_dtype='uint32', model_dtype='float16', linear_weight_layout='NK', quantize_embedding=True, quantize_final_fc=True, num_elem_per_storage=8, num_storage_per_group=4, max_int_value=7, tensor_parallel_shards=0)
  --model-type      phi3
  --target          {"thread_warp_size": runtime.BoxInt(1), "host": {"kind": "llvm", "tag": "", "keys": ["arm_cpu", "cpu"], "mtriple": "aarch64-linux-android"}, "texture_spatial_limit": runtime.BoxInt(16384), "max_threads_per_block": runtime.BoxInt(256), "max_function_args": runtime.BoxInt(128), "max_num_threads": runtime.BoxInt(256), "kind": "opencl", "max_shared_memory_per_block": runtime.BoxInt(16384), "tag": "", "keys": ["opencl", "gpu"]}
  --opt             flashinfer=0;cublas_gemm=0;faster_transformer=0;cudagraph=0;cutlass=0;ipc_allreduce_strategy=NONE
  --system-lib-prefix "phi3_q4f16_1_022af8ba80c36b0d4dc3cb6b36fdddf4_"
  --output          /tmp/tmpq7mp2ini/lib.tar
  --overrides       context_window_size=None;sliding_window_size=None;prefill_chunk_size=128;attention_sink_size=None;max_batch_size=None;tensor_parallel_shards=None;pipeline_parallel_stages=None
[2024-09-19 19:32:10] INFO config.py:107: Overriding prefill_chunk_size from 2048 to 128
[2024-09-19 19:32:10] INFO compile.py:140: Creating model from: Phi3Config(model_type='phi3', hidden_size=3072, vocab_size=32064, num_hidden_layers=32, num_attention_heads=32, intermediate_size=8192, rms_norm_eps=1e-05, num_key_value_heads=32, max_position_embeddings=131072, position_embedding_base=10000.0, rope_scaling={'long_factor': [1.0800000429153442, 1.1100000143051147, 1.1399999856948853, 1.340000033378601, 1.5899999141693115, 1.600000023841858, 1.6200000047683716, 2.620000123977661, 3.2300000190734863, 3.2300000190734863, 4.789999961853027, 7.400000095367432, 7.700000286102295, 9.09000015258789, 12.199999809265137, 17.670000076293945, 24.46000099182129, 28.57000160217285, 30.420001983642578, 30.840002059936523, 32.590003967285156, 32.93000411987305, 42.320003509521484, 44.96000289916992, 50.340003967285156, 50.45000457763672, 57.55000305175781, 57.93000411987305, 58.21000289916992, 60.1400032043457, 62.61000442504883, 62.62000274658203, 62.71000289916992, 63.1400032043457, 63.1400032043457, 63.77000427246094, 63.93000411987305, 63.96000289916992, 63.970001220703125, 64.02999877929688, 64.06999969482422, 64.08000183105469, 64.12000274658203, 64.41000366210938, 64.4800033569336, 64.51000213623047, 64.52999877929688, 64.83999633789062], 'short_factor': [1.0, 1.0199999809265137, 1.0299999713897705, 1.0299999713897705, 1.0499999523162842, 1.0499999523162842, 1.0499999523162842, 1.0499999523162842, 1.0499999523162842, 1.0699999332427979, 1.0999999046325684, 1.1099998950958252, 1.1599998474121094, 1.1599998474121094, 1.1699998378753662, 1.2899998426437378, 1.339999794960022, 1.679999828338623, 1.7899998426437378, 1.8199998140335083, 1.8499997854232788, 1.8799997568130493, 1.9099997282028198, 1.9399996995925903, 1.9899996519088745, 2.0199997425079346, 2.0199997425079346, 2.0199997425079346, 2.0199997425079346, 2.0199997425079346, 2.0199997425079346, 2.0299997329711914, 2.0299997329711914, 2.0299997329711914, 2.0299997329711914, 2.0299997329711914, 
2.0299997329711914, 2.0299997329711914, 2.0299997329711914, 2.0299997329711914, 2.0799996852874756, 2.0899996757507324, 2.189999580383301, 2.2199995517730713, 2.5899994373321533, 2.729999542236328, 2.749999523162842, 2.8399994373321533], 'type': 'longrope', 'rope_type': 'longrope', 'max_position_embeddings': 131072, 'original_max_position_embeddings': 4096}, original_max_position_embeddings=4096, context_window_size=131072, prefill_chunk_size=2048, head_dim=96, tensor_parallel_shards=1, max_batch_size=80, kwargs={})
[2024-09-19 19:32:10] INFO compile.py:158: Exporting the model to TVM Unity compiler
[2024-09-19 19:32:12] INFO compile.py:164: Running optimizations using TVM Unity
[2024-09-19 19:32:12] INFO compile.py:185: Registering metadata: {'model_type': 'phi3', 'quantization': 'q4f16_1', 'context_window_size': 131072, 'sliding_window_size': -1, 'attention_sink_size': -1, 'prefill_chunk_size': 128, 'tensor_parallel_shards': 1, 'pipeline_parallel_stages': 1, 'kv_state_kind': 'kv_cache', 'max_batch_size': 80}
[2024-09-19 19:32:14] INFO pipeline.py:54: Running TVM Relax graph-level optimizations
[2024-09-19 19:32:16] INFO pipeline.py:54: Lowering to TVM TIR kernels
[2024-09-19 19:32:20] INFO pipeline.py:54: Running TVM TIR-level optimizations
[2024-09-19 19:32:32] INFO pipeline.py:54: Running TVM Dlight low-level optimizations
[2024-09-19 19:32:33] INFO pipeline.py:54: Lowering to VM bytecode
[2024-09-19 19:32:35] INFO estimate_memory_usage.py:58: [Memory usage] Function `alloc_embedding_tensor`: 0.75 MB
[2024-09-19 19:32:35] INFO estimate_memory_usage.py:58: [Memory usage] Function `batch_decode`: 6.72 MB
[2024-09-19 19:32:35] INFO estimate_memory_usage.py:58: [Memory usage] Function `batch_prefill`: 10.75 MB
[2024-09-19 19:32:35] INFO estimate_memory_usage.py:58: [Memory usage] Function `batch_verify`: 10.75 MB
[2024-09-19 19:32:35] INFO estimate_memory_usage.py:58: [Memory usage] Function `create_tir_paged_kv_cache`: 0.00 MB
[2024-09-19 19:32:36] INFO estimate_memory_usage.py:58: [Memory usage] Function `decode`: 0.08 MB
[2024-09-19 19:32:36] INFO estimate_memory_usage.py:58: [Memory usage] Function `embed`: 0.75 MB
[2024-09-19 19:32:36] INFO estimate_memory_usage.py:58: [Memory usage] Function `prefill`: 10.76 MB
[2024-09-19 19:32:36] INFO estimate_memory_usage.py:58: [Memory usage] Function `softmax_with_temperature`: 0.00 MB
[2024-09-19 19:32:37] INFO pipeline.py:54: Compiling external modules
[2024-09-19 19:32:37] INFO pipeline.py:54: Compilation complete! Exporting to disk

Thread 1 "python" received signal SIGSEGV, Segmentation fault.
0x00007fff976329e7 in tvm::codegen::CreateLLVMTargetMachine(llvm::Target const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, llvm::TargetOptions const&, llvm::Reloc::Model const&, llvm::CodeModel::Model const&, llvm::CodeGenOpt::Level const&) [clone .isra.0] () from /home/lixiao/miniforge3/envs/mlc-llm/lib/python3.11/site-packages/tvm/libtvm.so
(gdb) bt
#0  0x00007fff976329e7 in tvm::codegen::CreateLLVMTargetMachine(llvm::Target const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, llvm::TargetOptions const&, llvm::Reloc::Model const&, llvm::CodeModel::Model const&, llvm::CodeGenOpt::Level const&) [clone .isra.0] () from /home/lixiao/miniforge3/envs/mlc-llm/lib/python3.11/site-packages/tvm/libtvm.so
#1  0x00007fff97636295 in tvm::codegen::LLVMTargetInfo::GetAllLLVMTargetArches() const () from /home/lixiao/miniforge3/envs/mlc-llm/lib/python3.11/site-packages/tvm/libtvm.so
#2  0x00007fff976378e3 in tvm::codegen::LLVMTargetInfo::LLVMTargetInfo(tvm::codegen::LLVMInstance&, tvm::runtime::Map<tvm::runtime::String, tvm::runtime::ObjectRef, void, void> const&) ()
   from /home/lixiao/miniforge3/envs/mlc-llm/lib/python3.11/site-packages/tvm/libtvm.so
#3  0x00007fff9763876d in tvm::codegen::LLVMTargetInfo::LLVMTargetInfo(tvm::codegen::LLVMInstance&, tvm::Target const&) ()
   from /home/lixiao/miniforge3/envs/mlc-llm/lib/python3.11/site-packages/tvm/libtvm.so
#4  0x00007fff97638bad in tvm::codegen::LLVMTarget::LLVMTarget(tvm::codegen::LLVMInstance&, tvm::Target const&) ()
   from /home/lixiao/miniforge3/envs/mlc-llm/lib/python3.11/site-packages/tvm/libtvm.so
#5  0x00007fff97644fd5 in tvm::codegen::LLVMModuleNode::Init(tvm::IRModule const&, tvm::Target const&) () from /home/lixiao/miniforge3/envs/mlc-llm/lib/python3.11/site-packages/tvm/libtvm.so
#6  0x00007fff97647907 in tvm::runtime::PackedFuncObj::Extractor<tvm::runtime::PackedFuncSubObj<tvm::runtime::TypedPackedFunc<tvm::runtime::Module (tvm::IRModule, tvm::Target)>::AssignTypedLambda<tvm::codegen::__mk_TVM0::{lambda(tvm::IRModule, tvm::Target)#1}>(tvm::codegen::__mk_TVM0::{lambda(tvm::IRModule, tvm::Target)#1}, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)::{lambda(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*)#1}> >::Call(tvm::runtime::PackedFuncObj const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, tvm::runtime::TVMRetValue) () from /home/lixiao/miniforge3/envs/mlc-llm/lib/python3.11/site-packages/tvm/libtvm.so
#7  0x00007fff96c96642 in tvm::codegen::Build(tvm::IRModule, tvm::Target) () from /home/lixiao/miniforge3/envs/mlc-llm/lib/python3.11/site-packages/tvm/libtvm.so
#8  0x00007fff95ab4123 in tvm::TIRToRuntime(tvm::runtime::Map<tvm::Target, tvm::IRModule, void, void> const&, tvm::Target const&) ()
   from /home/lixiao/miniforge3/envs/mlc-llm/lib/python3.11/site-packages/tvm/libtvm.so
#9  0x00007fff95ab7914 in tvm::runtime::PackedFuncObj::Extractor<tvm::runtime::PackedFuncSubObj<tvm::runtime::TypedPackedFunc<tvm::runtime::Module (tvm::runtime::Map<tvm::Target, tvm::IRModule, void, void> const&, tvm::Target)>::AssignTypedLambda<tvm::__mk_TVM24::{lambda(tvm::runtime::Map<tvm::Target, tvm::IRModule, void, void> const&, tvm::Target)#1}>(tvm::__mk_TVM24::{lambda(tvm::runtime::Map<tvm::Target, tvm::IRModule, void, void> const&, tvm::Target)#1}, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)::{lambda(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*)#1}> >::Call(tvm::runtime::PackedFuncObj const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, tvm::runtime::TVMRetValue) ()
   from /home/lixiao/miniforge3/envs/mlc-llm/lib/python3.11/site-packages/tvm/libtvm.so
#10 0x00007fff976cbdea in TVMFuncCall () from /home/lixiao/miniforge3/envs/mlc-llm/lib/python3.11/site-packages/tvm/libtvm.so
#11 0x00007fff943204d5 in __pyx_f_3tvm_4_ffi_4_cy3_4core_FuncCall(void*, _object*, TVMValue*, int*) ()
   from /home/lixiao/miniforge3/envs/mlc-llm/lib/python3.11/site-packages/tvm/_ffi/_cy3/core.cpython-311-x86_64-linux-gnu.so
#12 0x00007fff94320d78 in __pyx_pw_3tvm_4_ffi_4_cy3_4core_14PackedFuncBase_5__call__(_object*, _object*, _object*) ()
   from /home/lixiao/miniforge3/envs/mlc-llm/lib/python3.11/site-packages/tvm/_ffi/_cy3/core.cpython-311-x86_64-linux-gnu.so
#13 0x000055555572a85b in _PyObject_MakeTpCall (tstate=0x555555ac49f8 <_PyRuntime+166328>, callable=0x7ffea335e760, args=<optimized out>, nargs=2, keywords=0x0)
    at /usr/local/src/conda/python-3.11.10/Objects/call.c:214
#14 0x0000555555737ee7 in _PyEval_EvalFrameDefault (tstate=<optimized out>, frame=<optimized out>, throwflag=<optimized out>) at /usr/local/src/conda/python-3.11.10/Python/ceval.c:4769
#15 0x00005555557eeffd in _PyEval_EvalFrame (throwflag=0, frame=0x7ffff7ad11b8, tstate=0x555555ac49f8 <_PyRuntime+166328>)
    at /usr/local/src/conda/python-3.11.10/Include/internal/pycore_ceval.h:73
#16 _PyEval_Vector (tstate=0x555555ac49f8 <_PyRuntime+166328>, func=0x7ffff7a876a0, locals=<optimized out>, args=0x0, argcount=0, kwnames=0x0)
    at /usr/local/src/conda/python-3.11.10/Python/ceval.c:6434
#17 0x00005555557ee73f in PyEval_EvalCode (co=0x7ffff7bff6d0, globals=<optimized out>, locals=0x7ffff7c2a600) at /usr/local/src/conda/python-3.11.10/Python/ceval.c:1148
#18 0x0000555555805081 in builtin_exec_impl (module=<optimized out>, closure=<optimized out>, locals=0x7ffff7c2a600, globals=0x7ffff7c2a600, source=0x7ffff7bff6d0)
    at /usr/local/src/conda/python-3.11.10/Python/bltinmodule.c:1077
#19 builtin_exec (module=<optimized out>, args=<optimized out>, nargs=2, kwnames=<optimized out>) at /usr/local/src/conda/python-3.11.10/Python/clinic/bltinmodule.c.h:465
#20 0x0000555555744b1f in cfunction_vectorcall_FASTCALL_KEYWORDS (func=0x7ffff7bc0f90, args=0x7ffff7ad1180, nargsf=<optimized out>, kwnames=0x0)
    at /usr/local/src/conda/python-3.11.10/Include/cpython/methodobject.h:52
#21 0x0000555555744a0c in _PyObject_VectorcallTstate (kwnames=<optimized out>, nargsf=<optimized out>, args=<optimized out>, callable=0x7ffff7bc0f90, tstate=0x555555ac49f8 <_PyRuntime+166328>)
    at /usr/local/src/conda/python-3.11.10/Include/internal/pycore_call.h:92
#22 PyObject_Vectorcall (callable=0x7ffff7bc0f90, args=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at /usr/local/src/conda/python-3.11.10/Objects/call.c:299
#23 0x0000555555737ee7 in _PyEval_EvalFrameDefault (tstate=<optimized out>, frame=<optimized out>, throwflag=<optimized out>) at /usr/local/src/conda/python-3.11.10/Python/ceval.c:4769
#24 0x000055555575cb8f in _PyEval_EvalFrame (throwflag=0, frame=0x7ffff7ad1020, tstate=0x555555ac49f8 <_PyRuntime+166328>)
    at /usr/local/src/conda/python-3.11.10/Include/internal/pycore_ceval.h:73
#25 _PyEval_Vector (kwnames=<optimized out>, argcount=2, args=0x7ffff7a71318, locals=0x0, func=0x7ffff7a87380, tstate=0x555555ac49f8 <_PyRuntime+166328>)
    at /usr/local/src/conda/python-3.11.10/Python/ceval.c:6434
#26 _PyFunction_Vectorcall (func=0x7ffff7a87380, stack=0x7ffff7a71318, nargsf=<optimized out>, kwnames=<optimized out>) at /usr/local/src/conda/python-3.11.10/Objects/call.c:393
#27 0x0000555555817998 in pymain_run_module (modname=<optimized out>, set_argv0=1) at /usr/local/src/conda/python-3.11.10/Modules/main.c:300
#28 0x0000555555817372 in pymain_run_python (exitcode=0x7fffffffd700) at /usr/local/src/conda/python-3.11.10/Modules/main.c:599
#29 Py_RunMain () at /usr/local/src/conda/python-3.11.10/Modules/main.c:684
#30 0x00005555557de9c7 in Py_BytesMain (argc=<optimized out>, argv=<optimized out>) at /usr/local/src/conda/python-3.11.10/Modules/main.c:738
#31 0x00007ffff7cbcd90 in __libc_start_call_main (main=main@entry=0x5555557de920 <main>, argc=argc@entry=15, argv=argv@entry=0x7fffffffd958) at ../sysdeps/nptl/libc_start_call_main.h:58
#32 0x00007ffff7cbce40 in __libc_start_main_impl (main=0x5555557de920 <main>, argc=15, argv=0x7fffffffd958, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>,
    stack_end=0x7fffffffd948) at ../csu/libc-start.c:392
#33 0x00005555557de87a in _start ()

To Reproduce

Steps to reproduce the behavior:

1. Follow the Android tutorial: install dependencies, set environment variables, etc.
2. Run mlc_llm package.
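
To check whether a failing command died from a segmentation fault rather than an ordinary error, without attaching gdb, a small wrapper can inspect the child's exit status. This is a sketch using only the Python standard library; `run_and_classify` is a hypothetical helper name, and the negative return code convention applies to POSIX systems only.

```python
import signal
import subprocess


def run_and_classify(cmd):
    """Run cmd (a list of arguments) and report how it ended.

    Returns "ok" for exit code 0, "segfault" if the child was killed by
    SIGSEGV, and "failed" for any other non-zero outcome.
    """
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode == 0:
        return "ok"
    # On POSIX, a return code of -N means the child was killed by signal N.
    if result.returncode == -signal.SIGSEGV:
        return "segfault"
    return "failed"
```

Passing the full compile command from the gdb session as the argument list should report "segfault" if the crash reproduces. Starting the interpreter with `python -X faulthandler` additionally prints the Python-level traceback when the process receives SIGSEGV.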

Expected behavior

The model compiles, and the ./dist/ directory contains the outputs the tutorial describes.

Environment

james-banks commented 6 days ago

I hit the exact same issue when following the same guide with the ROCm 6.2 version, on native Ubuntu 24.04 with Python 3.11, both in Conda and in a virtual environment.

Q-point commented 9 hours ago

I observe the same issue. Following the Android instructions with ROCm 6.2 fails when calling mlc_llm package.