mlc-ai / mlc-llm

Universal LLM Deployment Engine with ML Compilation
https://llm.mlc.ai/
Apache License 2.0
18.49k stars 1.49k forks

[Bug] Model accuracy is severely degraded after quantization on Qwen2-1.5B-Instruct, please fix it #2568

Open Stephen888888 opened 2 months ago

Stephen888888 commented 2 months ago

🐛 Bug

MasterJH5574 commented 2 months ago

Thank you for opening the issue. It would be more helpful if you could share more details on how you ran the model and what behavior you observed.

Stephen888888 commented 2 months ago

The model is compiled as follows:

    mlc_llm convert_weight ./dist/models/Qwen2-1.5B/ --model-type qwen2 --quantization q4f16_1 -o dist/Qwen2-1.5B-q4f16_1-MLC
    mlc_llm gen_config ./dist/models/Qwen2-1.5B/ --model-type qwen2 --quantization q4f16_1 --conv-template chatml --context-window-size 2048 --max-batch-size 1 -o dist/Qwen2-1.5B-q4f16_1-MLC/
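For anyone reproducing these steps, one way to keep the source directory, the output directory, and the later package config in sync is to derive both commands from a single model name. The following is a minimal sketch under that assumption: `build_commands` is a hypothetical helper, it only uses the subcommands and flags quoted above, and it prints the commands rather than running them.

```python
import shlex


def build_commands(model_name, quantization="q4f16_1", conv_template="chatml"):
    """Build the convert_weight and gen_config commands from one model name.

    Hypothetical helper: the directory layout mirrors the commands quoted in
    the post and is an assumption, not something mlc_llm requires.
    """
    source_dir = f"./dist/models/{model_name}/"
    output_dir = f"dist/{model_name}-{quantization}-MLC"
    convert = [
        "mlc_llm", "convert_weight", source_dir,
        "--model-type", "qwen2",
        "--quantization", quantization,
        "-o", output_dir,
    ]
    gen_config = [
        "mlc_llm", "gen_config", source_dir,
        "--model-type", "qwen2",
        "--quantization", quantization,
        "--conv-template", conv_template,
        "--context-window-size", "2048",
        "--max-batch-size", "1",
        "-o", output_dir + "/",
    ]
    return output_dir, [convert, gen_config]


if __name__ == "__main__":
    output_dir, commands = build_commands("Qwen2-1.5B-Instruct")
    # The same directory should be referenced from mlc-package-config.json.
    print("model directory:", output_dir)
    for command in commands:
        print(shlex.join(command))
```

Printing the commands first makes it easy to confirm that the weights and the generated config land in the same directory before anything is executed.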

My mlc-package-config.json is as follows:

    {
      "device": "android",
      "model_list": [
        {
          "model": "/home/stephen/mlc-llm/dist/Qwen2-1.5B-Instruct-q4f16_1-MLC",
          "model_id": "Qwen2-1.5B-Instruct-q4f16_1-MLC",
          "estimated_vram_bytes": 3980990464,
          "bundle_weight": true
        }
      ]
    }
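As posted, the commands write to dist/Qwen2-1.5B-q4f16_1-MLC while the config points at a Qwen2-1.5B-Instruct-q4f16_1-MLC directory. This may only be a shortening in the post, but it is cheap to rule out before packaging. The sketch below checks that every local "model" path in the package config exists and contains the expected files. `check_package_config` and `EXPECTED_FILES` are hypothetical names, and the two file names are assumptions about what gen_config and convert_weight write into the output directory.

```python
import json
from pathlib import Path

# Assumption: gen_config writes mlc-chat-config.json and convert_weight
# writes ndarray-cache.json into the model output directory.
EXPECTED_FILES = ("mlc-chat-config.json", "ndarray-cache.json")


def check_package_config(config_path):
    """Return a list of human-readable problems found in the package config."""
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    problems = []
    for entry in config.get("model_list", []):
        model_dir = Path(entry["model"])
        model_id = entry.get("model_id", "<missing model_id>")
        if not model_dir.is_dir():
            problems.append(f"{model_id}: directory not found: {model_dir}")
            continue
        for name in EXPECTED_FILES:
            if not (model_dir / name).is_file():
                problems.append(f"{model_id}: missing {name} in {model_dir}")
    return problems


if __name__ == "__main__":
    for problem in check_package_config("mlc-package-config.json"):
        print(problem)
```

An empty result only means the paths line up. It says nothing about whether the quantized weights themselves are correct on a given GPU.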

Then I used mlc_llm package to build libtvm4j_runtime_packed.so and tvm4j_core.jar.

Then I opened android/MLCChat in Android Studio to build the APK.

Then I ran python bundle_weight.py --apk-path app/release/app-release.apk to install the APK on a Mate 50, which has a Snapdragon 8 Gen 1 chip. The output is shown in the image below:

[image]

But when the same APK is installed on a Mate 60, which has a Kirin 9000S chip with a Maleoon 910 GPU, the output is OK, just slow:

[image]

Stephen888888 commented 2 months ago

The log is shown below:

2024-06-12 16:43:11.778 10890-10890 HwRemoteIn...hodManager ai.mlc.mlcchat W isCasting false because IHwDistributedWindowManager is invalid.
2024-06-12 16:43:11.788 10890-10890 DecorView ai.mlc.mlcchat I navBarColor: fffcfcfc statusBarColor: ff00668b statusInsets: Insets{left=0, top=91, right=0, bottom=0} navInsets: Insets{left=0, top=0, right=0, bottom=0}
2024-06-12 16:43:11.788 10890-10890 DecorView ai.mlc.mlcchat I updateColorViewInt type:1 size: 0 showView:false color:fffcfcfc
2024-06-12 16:43:11.788 10890-10890 DecorView ai.mlc.mlcchat I updateColorViewInt type:0 size: 91 showView:true color:ff00668b
2024-06-12 16:43:11.800 10890-10890 HwViewRootImpl ai.mlc.mlcchat I removeInvalidNode all the node in jank list is out of time
2024-06-12 16:43:12.002 10890-10890 DecorView ai.mlc.mlcchat I navBarColor: fffcfcfc statusBarColor: ff00668b statusInsets: Insets{left=0, top=91, right=0, bottom=0} navInsets: Insets{left=0, top=0, right=0, bottom=0}
2024-06-12 16:43:12.002 10890-10890 DecorView ai.mlc.mlcchat I updateColorViewInt type:1 size: 0 showView:false color:fffcfcfc
2024-06-12 16:43:12.002 10890-10890 DecorView ai.mlc.mlcchat I updateColorViewInt type:0 size: 91 showView:true color:ff00668b
2024-06-12 16:43:12.568 10890-10933 HwViewRootImpl ai.mlc.mlcchat D [DetectViewsLocationRunner] start to get views' rect, type = SCENE_GESTURE_SINGLE_TAP
2024-06-12 16:43:12.570 10890-10933 HwViewRootImpl ai.mlc.mlcchat D [DetectViewsLocationRunner] windowModeType: 1
2024-06-12 16:43:12.570 10890-10933 HwViewRootImpl ai.mlc.mlcchat D [DetectViewsLocationRunner] displayPoint: Point(1088, 2400)
2024-06-12 16:43:12.570 10890-10933 HwViewRootImpl ai.mlc.mlcchat D [DetectViewsLocationRunner] windowModeType: 1
2024-06-12 16:43:12.570 10890-10933 HwViewRootImpl ai.mlc.mlcchat D [DetectViewsLocationRunner] lazyMode:
2024-06-12 16:43:12.570 10890-10933 HwViewRootImpl ai.mlc.mlcchat D [DetectViewsLocationRunner] current mode is full screen
2024-06-12 16:43:12.570 10890-10890 HwViewRootImpl ai.mlc.mlcchat D [DetectViewsLocationRunner] start to getViewHierarchy
2024-06-12 16:43:12.574 10890-10933 HwViewRootImpl ai.mlc.mlcchat D [DetectViewsLocationRunner] deviceOrientation: 0
2024-06-12 16:43:12.574 10890-10933 HwViewRootImpl ai.mlc.mlcchat D [DetectViewsLocationRunner-ScreenDirection] ROTATION_0
2024-06-12 16:43:12.574 10890-10933 HwViewRootImpl ai.mlc.mlcchat D [DetectViewsLocationRunner] get views' rect = 0, SCENE_GESTURE_SINGLE_TAP
2024-06-12 16:43:12.598 10890-10927 OpenGLRenderer ai.mlc.mlcchat I gpu complete fence is not signaled
2024-06-12 16:43:15.040 10890-10927 OpenGLRenderer ai.mlc.mlcchat I gpu complete fence is not signaled
2024-06-12 16:43:15.089 10890-10927 OpenGLRenderer ai.mlc.mlcchat I gpu complete fence is not signaled
2024-06-12 16:43:15.141 10890-10927 OpenGLRenderer ai.mlc.mlcchat I gpu complete fence is not signaled
2024-06-12 16:43:15.156 10890-10927 OpenGLRenderer ai.mlc.mlcchat I gpu complete fence is not signaled
2024-06-12 16:43:15.206 10890-10927 OpenGLRenderer ai.mlc.mlcchat I gpu complete fence is not signaled
2024-06-12 16:43:15.256 10890-10927 OpenGLRenderer ai.mlc.mlcchat I gpu complete fence is not signaled
2024-06-12 16:43:15.273 10890-10927 OpenGLRenderer ai.mlc.mlcchat I gpu complete fence is not signaled
2024-06-12 16:43:15.323 10890-10927 OpenGLRenderer ai.mlc.mlcchat I gpu complete fence is not signaled
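A pasted log like this is easier to triage once it is split back into entries and grouped by tag. The sketch below does that for the field layout seen here (date, time, pid-tid, tag, package, level, message). The layout is inferred from this excerpt only, and `split_entries` and `count_tags` are hypothetical helper names.

```python
import re
from collections import Counter

# Timestamp shape seen in the excerpt, e.g. "2024-06-12 16:43:11.778".
TIMESTAMP = r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}"

# Assumed field order: time, pid-tid, tag, package, level, message.
ENTRY = re.compile(
    r"^(?P<time>" + TIMESTAMP + r")\s+(?P<pid>\d+)-(?P<tid>\d+)\s+"
    r"(?P<tag>\S+)\s+(?P<package>\S+)\s+(?P<level>[VDIWEF])\s+(?P<message>.*)$"
)


def split_entries(blob):
    """Split a flattened log blob into one string per entry."""
    # Collapse stray line breaks first, then cut before each timestamp.
    joined = " ".join(blob.split())
    return re.split(r"\s(?=" + TIMESTAMP + r"\s)", joined)


def count_tags(blob):
    """Count log entries per tag, skipping anything that does not parse."""
    counts = Counter()
    for line in split_entries(blob):
        match = ENTRY.match(line)
        if match:
            counts[match["tag"]] += 1
    return counts


if __name__ == "__main__":
    import sys

    for tag, count in count_tags(sys.stdin.read()).most_common():
        print(f"{count:5d}  {tag}")
```

In the excerpt above every entry comes from a UI or renderer tag (DecorView, HwViewRootImpl, OpenGLRenderer, and one truncated tag), so a tag summary like this mainly helps confirm whether any runtime messages were captured at all.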

Stephen888888 commented 2 months ago

@jeethu Has any problem been found?

Stephen888888 commented 1 month ago

Has any problem been found?

Stephen888888 commented 1 month ago

Has any problem been found?