mlc-ai / mlc-llm

Universal LLM Deployment Engine with ML Compilation
https://llm.mlc.ai/
Apache License 2.0
19.12k stars 1.57k forks source link

[Bug] Qwen2-1.5B Q4F16_0 - libc++abi: terminating due to uncaught exception of type std::length_error: vector #2876

Closed digisomni closed 3 weeks ago

digisomni commented 2 months ago

🐛 Bug

I believe when the final token completes or is about to be completed in a request, the entire app crashes with libc++abi: terminating due to uncaught exception of type std::length_error: vector

To Reproduce

Steps to reproduce the behavior:

  1. Load an MLC Engine in Python with Qwen2-1.5B Q4F16_0
  2. Inference it with a structured request (JSON), use streaming so you can see how all the tokens do complete but before/around when the request finalizes, it crashes.

Expected behavior

MLC should not crash, which in turn crashes my whole Python application.

Environment

Ubospica commented 1 month ago

Hi @digisomni, thanks for reporting the error! Could you provide the complete error message and the script to reproduce the error so we can better identify the problem? I failed to reproduce this error on my device, but not certain if this is related the environment.