mlc-ai / mlc-llm

Universal LLM Deployment Engine with ML Compilation
https://llm.mlc.ai/
Apache License 2.0
19.15k stars, 1.57k forks

[Bug] Stack trace not available when DMLC_LOG_STACK_TRACE is disabled at compile time. #2122

Closed. dusihuaxin closed this issue 6 months ago.

dusihuaxin commented 7 months ago

🐛 Bug

I downloaded the model mlc-ai/Qwen1.5-MoE-A2.7B-Chat-q4f16_1-MLC, which mlc-ai provides on Hugging Face. First, I compiled it to a DLL with this command:

mlc_llm compile E:\Qwen1.5-MoE-A2.7B-Chat-q4f16_1-MLC/mlc-chat-config.json --device vulkan -o E:\libs\Qwen1.5-MoE-A2.7B-Chat-q4f16_1-vulkan.dll

Although the compilation completes, the following error is printed three times during the process:

[18:26:27] D:\a\package\package\tvm\src\tir\ir\stmt.cc:122: InternalError: Check failed: (e.dtype().bits() <= loop_var.dtype().bits()) is false: Loop variable's dtype (int32) is narrower than that of min or extent (int64)
Stack trace not available when DMLC_LOG_STACK_TRACE is disabled at compile time.

Here is an excerpt of the log that follows:

[2024-04-11 18:26:27] INFO pipeline.py:50: Lowering to VM bytecode
[2024-04-11 18:26:29] INFO estimate_memory_usage.py:57: [Memory usage] Function alloc_embedding_tensor: 16.00 MB
[2024-04-11 18:26:29] INFO estimate_memory_usage.py:57: [Memory usage] Function batch_decode: 0.64 MB
[2024-04-11 18:26:29] INFO estimate_memory_usage.py:57: [Memory usage] Function batch_prefill: 236.58 MB
[2024-04-11 18:26:29] INFO estimate_memory_usage.py:57: [Memory usage] Function batch_verify: 2610.00 MB
[2024-04-11 18:26:29] INFO estimate_memory_usage.py:57: [Memory usage] Function create_tir_paged_kv_cache: 0.00 MB
[2024-04-11 18:26:29] INFO estimate_memory_usage.py:57: [Memory usage] Function decode: 0.64 MB
[2024-04-11 18:26:29] INFO estimate_memory_usage.py:57: [Memory usage] Function embed: 16.00 MB
[2024-04-11 18:26:29] INFO estimate_memory_usage.py:57: [Memory usage] Function prefill: 236.58 MB
[2024-04-11 18:26:29] INFO estimate_memory_usage.py:57: [Memory usage] Function softmax_with_temperature: 0.00 MB
[2024-04-11 18:26:29] INFO pipeline.py:50: Compiling external modules
[2024-04-11 18:26:29] INFO pipeline.py:50: Compilation complete! Exporting to disk
[2024-04-11 18:26:37] INFO model_metadata.py:96: Total memory usage: 3605.82 MB (Parameters: 995.82 MB. KVCache: 0.00 MB. Temporary buffer: 2610.00 MB)
[2024-04-11 18:26:37] INFO model_metadata.py:105: To reduce memory usage, tweak prefill_chunk_size, context_window_size and sliding_window_size
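For anyone reading the memory figures in this log, here is a minimal sketch that parses the per-function estimates and checks them against the "Total memory usage" line. It only reuses numbers from the log. The helper name parse_memory_usage is hypothetical. That the temporary buffer equals the largest per-function estimate is an observation about this particular log, not a documented rule.

```python
import re

# Per-function estimates copied from the compile log in this report.
LOG = """\
[Memory usage] Function alloc_embedding_tensor: 16.00 MB
[Memory usage] Function batch_decode: 0.64 MB
[Memory usage] Function batch_prefill: 236.58 MB
[Memory usage] Function batch_verify: 2610.00 MB
[Memory usage] Function create_tir_paged_kv_cache: 0.00 MB
[Memory usage] Function decode: 0.64 MB
[Memory usage] Function embed: 16.00 MB
[Memory usage] Function prefill: 236.58 MB
[Memory usage] Function softmax_with_temperature: 0.00 MB
"""

PATTERN = re.compile(r"Function (\w+): ([\d.]+) MB")


def parse_memory_usage(log: str) -> dict[str, float]:
    """Map each function name to its estimated memory usage in MB."""
    return {name: float(mb) for name, mb in PATTERN.findall(log)}


usage = parse_memory_usage(LOG)

# The function with the largest estimate; in this log its value matches
# the reported "Temporary buffer" figure.
largest = max(usage, key=usage.get)
temporary_mb = usage[largest]

# Values taken from the "Total memory usage" line of the same log.
parameters_mb = 995.82
kv_cache_mb = 0.00

total_mb = round(parameters_mb + kv_cache_mb + temporary_mb, 2)
print(largest, temporary_mb, total_mb)
```

In this log, batch_verify accounts for the whole temporary buffer, so that is the function whose estimate would have to shrink for the total to drop.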

Then I used the following code to load the two files (the model and the compiled library), and it reported that a parameter could not be found in the model cache. The errors in both places end with the same DMLC_LOG_STACK_TRACE message, but since I have not read your source code, I don't know what is actually going wrong.

from mlc_llm import ChatModule

# Raw strings keep the backslashes in Windows paths literal.
cm = ChatModule(
    model=r"E:\Qwen1.5-MoE-A2.7B-Chat-q4f16_1-MLC",
    model_lib_path=r"E:\libs\Qwen1.5-MoE-A2.7B-Chat-q4f16_1-vulkan.dll",
)

Output:

[18:28:11] D:\a\package\package\tvm\src\runtime\relax_vm\ndarray_cache_support.cc:333: ValueError: Cannot find parameter in cache: model.layers.0.mlp.gate_up_proj.q_weight
Stack trace not available when DMLC_LOG_STACK_TRACE is disabled at compile time.
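A "Cannot find parameter in cache" error suggests that the compiled library expects a parameter name that is not in the weights loaded from the model directory. Below is a minimal sketch for checking this by hand. It assumes the model directory contains an ndarray-cache.json file whose top-level "records" list holds shards, each with a nested "records" list of entries that carry a "name" field. That layout is an assumption and may differ between versions. The helper names are hypothetical.

```python
import json
from pathlib import Path


def cached_parameter_names(model_dir: str) -> set[str]:
    """Collect the parameter names listed in <model_dir>/ndarray-cache.json.

    Assumed layout: {"records": [{"records": [{"name": ...}, ...]}, ...]}.
    """
    cache_file = Path(model_dir) / "ndarray-cache.json"
    with cache_file.open(encoding="utf-8") as f:
        cache = json.load(f)

    names: set[str] = set()
    for shard in cache.get("records", []):
        for entry in shard.get("records", []):
            names.add(entry["name"])
    return names


def has_parameter(model_dir: str, name: str) -> bool:
    """Return True if the weight cache lists a parameter with this name."""
    return name in cached_parameter_names(model_dir)
```

For this report, the call would be has_parameter(r"E:\Qwen1.5-MoE-A2.7B-Chat-q4f16_1-MLC", "model.layers.0.mlp.gate_up_proj.q_weight"). If it returns False, the weights in that folder do not match what the compiled library expects.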

dusihuaxin commented 7 months ago

I think this problem can be put aside for the time being. I found that compiling to a DLL also produces two additional files (.exp and .lib). The model can be loaded and used normally.

However, an error still occurs during conversion:

ValueError: The block no longer exists in the IRModule
Stack trace not available when DMLC_LOG_STACK_TRACE is disabled at compile time.

dusihuaxin commented 7 months ago

The first time I converted the model, I put the MLC files and the DLL in the same folder for convenience. Based on the tutorial, I thought only the DLL file mattered, and I didn't know what the other extra files were for.

However, I still hope the authors can fix the cause of the error during compilation on Windows. This time I followed the official documentation and used the qwen1.5-0.5b model for the entire process, from MLC conversion to DLL generation.

MasterJH5574 commented 6 months ago

Thank you @dusihuaxin for reporting. We will look into this.

tqchen commented 6 months ago

This was due to the prefill_chunk_size setting; reducing it should help with the issue.
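One way to act on this suggestion is to lower prefill_chunk_size in mlc-chat-config.json before recompiling. The sketch below assumes the setting is a top-level integer key in that file. The function name is hypothetical, and no particular value is recommended in this thread.

```python
import json
from pathlib import Path


def lower_prefill_chunk_size(config_path: str, new_size: int) -> int:
    """Lower prefill_chunk_size in a JSON config file.

    The file is rewritten only if new_size is smaller than the current
    value. Returns the value stored in the file afterwards.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))

    # Assumption: the setting is a top-level key of the config.
    if "prefill_chunk_size" not in config:
        raise KeyError("prefill_chunk_size not found in config")

    current = config["prefill_chunk_size"]
    if new_size < current:
        config["prefill_chunk_size"] = new_size
        path.write_text(
            json.dumps(config, indent=2, ensure_ascii=False),
            encoding="utf-8",
        )
        return new_size
    return current
```

After editing the config, rerun the mlc_llm compile command from the original report so that the library is rebuilt with the smaller chunk size.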

dusihuaxin commented 6 months ago

This is due to the prefill_chunk_size setting; reducing it will help resolve the issue.

Thank you for the info.