Closed tom-pollak closed 1 month ago
I'm currently trying to get Snapdragon 8 Gen 2 working. Will post an update when it works. Which chip are you looking to get support for?
Thanks @Abhishek8394 for spending time on this! I added 8gen2 support in the latest commit: https://github.com/saic-fi/MobileQuant/commit/084b62c0e41b9ebf2800cc0aecb6cf9ed231bad1
@tom-pollak the code was tested only on 8gen3 and 8gen2. These are the devices I have access to. Unfortunately, there is no plan to support more devices at this moment.
Most of the important modifications were in the assets folder and in device/export.py, which involved adding the config files and changing the chipset id (8650 -> 8550) and the hexagon id (v75 -> v73).
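As a rough illustration of the per-device values mentioned above, here is a minimal sketch of a lookup table keyed by device type. The chipset and hexagon ids come from this thread; `DeviceSpec`, `DEVICE_SPECS`, and `get_device_spec` are hypothetical names for illustration and are not taken from the MobileQuant code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceSpec:
    """Per-device values that differ between 8gen3 and 8gen2 (hypothetical container)."""
    chipset_id: int    # chipset id quoted in this thread
    hexagon_arch: str  # hexagon id quoted in this thread


# Values are the ones quoted in this thread; the table itself is an assumption.
DEVICE_SPECS = {
    "8gen3": DeviceSpec(chipset_id=8650, hexagon_arch="v75"),
    "8gen2": DeviceSpec(chipset_id=8550, hexagon_arch="v73"),
}


def get_device_spec(device_type: str) -> DeviceSpec:
    """Return the spec for a supported device type, or raise ValueError."""
    try:
        return DEVICE_SPECS[device_type]
    except KeyError:
        supported = ", ".join(sorted(DEVICE_SPECS))
        raise ValueError(
            f"unsupported device_type {device_type!r}; supported: {supported}"
        ) from None


if __name__ == "__main__":
    print(get_device_spec("8gen2"))
```

Supporting another chip would presumably mean adding one more entry here plus the matching config files, though that has not been tested.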
There were also API changes in the capp code; the ones you will most likely care about are in capp/src/qnn_context.cpp.
To export a QNN model compatible with 8gen2, please use --device_type 8gen2 with the export.py script:
CUDA_VISIBLE_DEVICES=0 python device/export.py --hf_path ${HF_PATH} \
--kv_cache --per_channel --use_conv --quant_config 4 8 32 \
--quant_encoding results/sim_${HF_NAME}_calibration/gen/model_gen_transfered.encodings \
--kv_encoding results/sim_${HF_NAME}_calibration/gen/model_gen_kv_cache.encodings \
--device_type 8gen2
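If you export for more than one device, a small wrapper that assembles the same command with a different --device_type may be handy. This is only a sketch: `build_export_cmd` is a hypothetical helper, the flags are exactly those in the command above, and the example path and name passed at the bottom are placeholders.

```python
import shlex
from typing import List


def build_export_cmd(hf_path: str, hf_name: str, device_type: str = "8gen2") -> List[str]:
    """Assemble the device/export.py invocation from this thread as an argv list."""
    enc_dir = f"results/sim_{hf_name}_calibration/gen"
    return [
        "python", "device/export.py",
        "--hf_path", hf_path,
        "--kv_cache", "--per_channel", "--use_conv",
        "--quant_config", "4", "8", "32",
        "--quant_encoding", f"{enc_dir}/model_gen_transfered.encodings",
        "--kv_encoding", f"{enc_dir}/model_gen_kv_cache.encodings",
        "--device_type", device_type,
    ]


if __name__ == "__main__":
    # Placeholder arguments; substitute your own ${HF_PATH} and ${HF_NAME}.
    cmd = build_export_cmd("path/to/hf_model", "my_model", device_type="8gen2")
    # Print the command for inspection instead of running it.
    print(shlex.join(cmd))
```

Passing the list to `subprocess.run` with `CUDA_VISIBLE_DEVICES` set in the environment should reproduce the shell command above.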
To build the app:
make aarch64-android-8gen2
To run the app:
adb shell "cd /data/local/tmp/llm_8gen3_demo && LD_LIBRARY_PATH=. ./simple_app llama-1.1b-mobilequant-w8a8-s1024-e60-8gen3 8gen2"
Feel free to reopen the issue if there are any further questions.
The current code and instructions are specific to the Snapdragon 8 Gen 3 NPU. I was wondering what modifications would be necessary to support other Snapdragon devices? Specifically:
Thanks for your help!