Thank you @sjtu-scx for reporting! The failure happens because we want to make sure that the `eos_token_id` in `mlc-chat-config.json` is an integer, but it turns out that in your case it is not. Could you help me check what the value of `eos_token_id` is in `dist\qwen1.5-7b-chat-q4f16_1-MLC\mlc-chat-config.json`?
Thanks for your patience in replying. Here is my `mlc-chat-config.json` file; I found that `eos_token_id` is a list and not a single value.

```json
{
  "model_type": "qwen2",
  "quantization": "q4f16_1",
  "model_config": {
    "hidden_act": "silu",
    "hidden_size": 4096,
    "intermediate_size": 11008,
    "num_attention_heads": 32,
    "num_hidden_layers": 32,
    "num_key_value_heads": 32,
    "rms_norm_eps": 1e-06,
    "rope_theta": 1000000.0,
    "vocab_size": 151936,
    "context_window_size": 768,
    "prefill_chunk_size": 768,
    "tensor_parallel_shards": 1,
    "dtype": "float32"
  },
  "vocab_size": 151936,
  "context_window_size": 768,
  "sliding_window_size": -1,
  "prefill_chunk_size": 768,
  "attention_sink_size": -1,
  "tensor_parallel_shards": 1,
  "mean_gen_len": 128,
  "max_gen_len": 512,
  "shift_fill_factor": 0.3,
  "temperature": 0.7,
  "presence_penalty": 0.0,
  "frequency_penalty": 0.0,
  "repetition_penalty": 1.05,
  "top_p": 0.8,
  "conv_template": "chatml",
  "pad_token_id": 151643,
  "bos_token_id": 151643,
  "eos_token_id": [151645, 151643],
  "tokenizer_files": [
    "tokenizer.json",
    "vocab.json",
    "merges.txt",
    "tokenizer_config.json"
  ],
  "version": "0.1.0"
}
```
When I change the `eos_token_id` directly to `151645`, the error disappears.
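For anyone on an older wheel who wants to apply the same manual workaround, here is a minimal sketch (the path is the one from this thread, 151645 is the first entry of the list shown in the config above, and the fix linked further down in the thread removes the need for this edit entirely):

```python
import json
from pathlib import Path

# Path taken from this thread -- adjust to your own model directory.
cfg_path = Path("dist/qwen1.5-7b-chat-q4f16_1-MLC/mlc-chat-config.json")

cfg = json.loads(cfg_path.read_text(encoding="utf-8"))
if isinstance(cfg.get("eos_token_id"), list):
    # Older ChatModule builds expect a single integer here, so keep only the
    # first stop token (151645 in the config above).
    cfg["eos_token_id"] = cfg["eos_token_id"][0]
    cfg_path.write_text(json.dumps(cfg, indent=2), encoding="utf-8")
```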
Hi, I ran into the same problem and am trying the same model, but I have a question: how should `model_lib` in `app-config.json` and `--conv-template` at compile time be set?
I used:

```json
{
  "model_url": "",
  "model_lib": "qwen-2_q40f16",
  "estimated_vram_bytes": 4348727787,
  "model_id": "Qwen1.5-1.8B-Chat-q0f16"
}
```

```bash
mlc_chat gen_config ./dist/models/$MODEL_NAME/ --quantization $QUANTIZATION \
  --conv-template llama-2 --context-window-size 768 -o dist/${MODEL_NAME}-${QUANTIZATION}-MLC/
```

and it fails.
Hi, Qwen uses the ChatML template. Set `--conv-template` to `chatml`; don't use `llama-2`.
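Concretely, that is the command from the post above with only the template flag changed (everything else kept exactly as posted):

```bash
mlc_chat gen_config ./dist/models/$MODEL_NAME/ --quantization $QUANTIZATION \
  --conv-template chatml --context-window-size 768 -o dist/${MODEL_NAME}-${QUANTIZATION}-MLC/
```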
Thanks a lot! And how should `model_lib` be set? `qwen2_q40f16` gives an error at deployment time.
You're welcome. Which platform are you having trouble deploying to? After `mlc_chat gen_config`, the next step is to compile the model library for the target device, which is also done from the command line; no extra settings are needed.
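For reference, a rough sketch of that follow-up compile step for an Android target; the exact flags can differ between mlc_chat versions, so treat it as illustrative and check the CLI help of your installed version:

```bash
# Illustrative only -- see `mlc_chat compile --help` for the version you have installed.
mlc_chat compile dist/${MODEL_NAME}-${QUANTIZATION}-MLC/mlc-chat-config.json \
  --device android -o dist/libs/${MODEL_NAME}-${QUANTIZATION}-android.tar
```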
I'm trying Android and iOS deployment. When building the apk, a `model_lib` has to be specified, and I ran into the same problem as https://github.com/mlc-ai/mlc-llm/issues/1517.
I followed this workflow: https://github.com/Tao-begd/mlc-llm-android. Not sure whether it helps you.
Here is how I set it up: delete the models you don't need, then add your own model and set the paths correctly.

```json
{
  "model_list": [
    {
      "model_url": "https://huggingface.co/mlc-ai/Llama-2-7b-chat-hf-q4f16_1-MLC/",
      "model_lib": "llama_q4f16_1",
      "estimated_vram_bytes": 4348727787,
      "model_id": "Llama-2-7b-chat-hf-q4f16_1"
    }
  ],
  "model_lib_path_for_prepare_libs": {
    "llama_q4f16_1": "Llama-2-7b-chat-hf-q4f16_1-MLC\Llama-2-7b-chat-hf-q4f16_1-android.tar"
  }
}
```

Hope it helps!
@MasterJH5574 maybe a good lesson is that we should validate the generated mlc-chat-config.json for the necessary fields in `gen_config`.
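For illustration, a post-generation check along these lines (a sketch, not the actual gen_config code) would have flagged the list-valued `eos_token_id` up front:

```python
import json

def validate_mlc_chat_config(path: str) -> None:
    """Sketch of a sanity check for a generated mlc-chat-config.json."""
    with open(path, encoding="utf-8") as f:
        cfg = json.load(f)
    # Fields that the runtime discussed in this thread relies on.
    for field in ("conv_template", "eos_token_id", "tokenizer_files"):
        if field not in cfg:
            raise ValueError(f"missing required field: {field}")
    eos = cfg["eos_token_id"]
    if not (isinstance(eos, int)
            or (isinstance(eos, list) and all(isinstance(t, int) for t in eos))):
        raise ValueError("eos_token_id must be an int or a list of ints")
```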
@sjtu-scx Thanks for sharing the config! Yes right now the ChatModule assumes the eos token id is a single token id, which does not hold for this case. We will work on a fix soon.
Fixed here: https://github.com/mlc-ai/mlc-llm/pull/1940, by removing the need for `eos_token_ids`. Please wait 1-2 days for the PyPI wheel updates.
@MasterJH5574 If I set the `eos_token_id` directly to a single value, say 151645, instead of using the original list, and then recompile the tar file and package the apk again, then upon installing and running the qwen2 model on the phone, the entire system freezes and eventually crashes, requiring a phone restart. Have you encountered this issue before?
@MrRace Thanks for the question. Do you mean the result is caused by only changing the `eos_token_id`? Maybe we can follow up on this in a new issue. Also cc @Kartik14
The original issue should have been resolved. Closing this issue for now.
@MasterJH5574 Thanks a lot for your reply. What I mean is: if we simply change the original list value of `eos_token_id` to a single value, it no longer triggers the previous error (`TVMError: Check failed: (config["eos_token_id"].is<int64_t>()) is false:`), but when I type dialogue text into the input box, it causes the phone to freeze, crash, and reboot.
🐛 Bug

TVMError: Check failed: (config["eos_token_id"].is<int64_t>()) is false:

When I compile qwen1.5-7B-Chat with the chatml template, there is no problem with the compilation process, but when I call it, the following error appears:

```
[2024-03-11 20:35:35] INFO auto_device.py:85: Not found device: cuda:0
[2024-03-11 20:35:35] INFO auto_device.py:85: Not found device: rocm:0
[2024-03-11 20:35:36] INFO auto_device.py:85: Not found device: metal:0
[2024-03-11 20:35:40] INFO auto_device.py:76: Found device: vulkan:0
[2024-03-11 20:35:41] INFO auto_device.py:85: Not found device: opencl:0
[2024-03-11 20:35:41] INFO auto_device.py:33: Using device: vulkan:0
[2024-03-11 20:35:41] INFO chat_module.py:373: Using model folder: C:\Users\sunchenxing\Desktop\mlc_new\dist\qwen1.5-7b-chat-q4f16_1-MLC
[2024-03-11 20:35:41] INFO chat_module.py:374: Using mlc chat config: C:\Users\sunchenxing\Desktop\mlc_new\dist\qwen1.5-7b-chat-q4f16_1-MLC\mlc-chat-config.json
[2024-03-11 20:35:41] INFO chat_module.py:516: Using library model: dist/libs/qwen1.5-7b-chat-q4f16_1-vulkan.dll
[2024-03-11 20:35:42] INFO model_metadata.py:96: Total memory usage: 5058.70 MB (Parameters: 4142.95 MB. KVCache: 384.00 MB. Temporary buffer: 531.75 MB)
[2024-03-11 20:35:42] INFO model_metadata.py:105: To reduce memory usage, tweak `prefill_chunk_size`, `context_window_size` and `sliding_window_size`
Traceback (most recent call last):
  File "C:\Users\sunchenxing\Desktop\mlc_new\test.py", line 5, in <module>
    cm = ChatModule(
  File "C:\Users\sunchenxing\.conda\envs\mlc\lib\site-packages\mlc_chat\chat_module.py", line 783, in __init__
    self._reload(self.model_lib_path, self.model_path, user_chat_config_json_str)
  File "C:\Users\sunchenxing\.conda\envs\mlc\lib\site-packages\mlc_chat\chat_module.py", line 1002, in _reload
    self._reload_func(lib, model_path, app_config_json)
  File "C:\Users\sunchenxing\.conda\envs\mlc\lib\site-packages\tvm\_ffi\_ctypes\packed_func.py", line 239, in __call__
    raise_last_ffi_error()
  File "C:\Users\sunchenxing\.conda\envs\mlc\lib\site-packages\tvm\_ffi\base.py", line 481, in raise_last_ffi_error
    raise py_err
tvm._ffi.base.TVMError: Traceback (most recent call last):
  File "D:\a\package\package\mlc-llm\cpp\llm_chat.cc", line 574
TVMError: Check failed: (config["eos_token_id"].is<int64_t>()) is false:
```

Environment

- How you installed MLC-LLM (conda, source): pip
- How you installed TVM-Unity (pip, source): pip
- TVM Unity Hash Tag (`python -c "import tvm; print('\n'.join(f'{k}: {v}' for k, v in tvm.support.libinfo().items()))"`, applicable if you compile models):

```
USE_NVTX: OFF USE_GTEST: AUTO SUMMARIZE: OFF USE_IOS_RPC: OFF USE_MSC: OFF USE_ETHOSU: CUDA_VERSION: NOT-FOUND USE_LIBBACKTRACE: AUTO DLPACK_PATH: 3rdparty/dlpack/include
USE_TENSORRT_CODEGEN: OFF USE_THRUST: OFF USE_TARGET_ONNX: OFF USE_AOT_EXECUTOR: ON BUILD_DUMMY_LIBTVM: OFF USE_CUDNN: OFF USE_TENSORRT_RUNTIME: OFF
USE_ARM_COMPUTE_LIB_GRAPH_EXECUTOR: OFF USE_CCACHE: AUTO USE_ARM_COMPUTE_LIB: OFF USE_CPP_RTVM: USE_OPENCL_GTEST: /path/to/opencl/gtest USE_MKL: OFF USE_PT_TVMDSOOP: OFF
MLIR_VERSION: NOT-FOUND USE_CLML: OFF USE_STACKVM_RUNTIME: OFF USE_GRAPH_EXECUTOR_CUDA_GRAPH: OFF ROCM_PATH: /opt/rocm USE_DNNL: OFF USE_VITIS_AI: OFF USE_MLIR: OFF USE_RCCL: OFF
USE_LLVM: llvm-config --link-static USE_VERILATOR: OFF USE_TF_TVMDSOOP: OFF USE_THREADS: ON USE_MSVC_MT: OFF BACKTRACE_ON_SEGFAULT: OFF USE_GRAPH_EXECUTOR: ON USE_NCCL: OFF
USE_ROCBLAS: OFF GIT_COMMIT_HASH: f06d486b4a1a27f0bbb072688a5fc41e7b15323c USE_VULKAN: ON USE_RUST_EXT: OFF USE_CUTLASS: OFF USE_CPP_RPC: OFF USE_HEXAGON: OFF
USE_CUSTOM_LOGGING: OFF USE_UMA: OFF USE_FALLBACK_STL_MAP: OFF USE_SORT: ON USE_RTTI: ON GIT_COMMIT_TIME: 2024-03-08 02:04:22 -0500 USE_HEXAGON_SDK: /path/to/sdk USE_BLAS: none
USE_ETHOSN: OFF USE_LIBTORCH: OFF USE_RANDOM: ON USE_CUDA: OFF USE_COREML: OFF USE_AMX: OFF BUILD_STATIC_RUNTIME: OFF USE_CMSISNN: OFF USE_KHRONOS_SPIRV: OFF
USE_CLML_GRAPH_EXECUTOR: OFF USE_TFLITE: OFF USE_HEXAGON_GTEST: /path/to/hexagon/gtest PICOJSON_PATH: 3rdparty/picojson USE_OPENCL_ENABLE_HOST_PTR: OFF INSTALL_DEV: OFF
USE_PROFILER: ON USE_NNPACK: OFF LLVM_VERSION: 17.0.6 USE_MRVL: OFF USE_OPENCL: OFF COMPILER_RT_PATH: 3rdparty/compiler-rt RANG_PATH: 3rdparty/rang/include
USE_SPIRV_KHR_INTEGER_DOT_PRODUCT: OFF USE_OPENMP: OFF USE_BNNS: OFF USE_CUBLAS: OFF USE_METAL: OFF USE_MICRO_STANDALONE_RUNTIME: OFF USE_HEXAGON_EXTERNAL_LIBS: OFF
USE_ALTERNATIVE_LINKER: AUTO USE_BYODT_POSIT: OFF USE_HEXAGON_RPC: OFF USE_MICRO: OFF DMLC_PATH: 3rdparty/dmlc-core/include INDEX_DEFAULT_I64: ON USE_RELAY_DEBUG: OFF
USE_RPC: ON USE_TENSORFLOW_PATH: none TVM_CLML_VERSION: USE_MIOPEN: OFF USE_ROCM: OFF USE_PAPI: OFF USE_CURAND: OFF
TVM_CXX_COMPILER_PATH: C:/Program Files/Microsoft Visual Studio/2022/Enterprise/VC/Tools/MSVC/14.38.33130/bin/HostX64/x64/cl.exe HIDE_PRIVATE_SYMBOLS: OFF
```
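For context, a rough reconstruction of the `test.py` call that produced the traceback above; the model and library paths are taken from the log, while the prompt is invented:

```python
from mlc_chat import ChatModule

# Paths as reported in the log above -- adjust to your own layout.
cm = ChatModule(
    model="dist/qwen1.5-7b-chat-q4f16_1-MLC",
    model_lib_path="dist/libs/qwen1.5-7b-chat-q4f16_1-vulkan.dll",
)
print(cm.generate("Hello"))
```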