quic / efficient-transformers

This library empowers users to seamlessly port pretrained models and checkpoints on the HuggingFace (HF) hub (developed using HF transformers library) into inference-ready formats that run efficiently on Qualcomm Cloud AI 100 accelerators.
https://quic.github.io/efficient-transformers/

QNN Compilation Support. #121

Open shubhagr-quic opened 2 weeks ago

shubhagr-quic commented 2 weeks ago
1. Changed the Infer/Compile APIs to accept --enable_qnn, optionally followed by a QNN config file.
2. Added the qnn_config.json file format.
3. Added generate_qnn_network_specialization_config.py, which creates the custom_io_config.yaml file for the QNN compilation step.
4. Made changes to support the Python 3.10 requirement for QNN.
5. Modified compiler_constants.py and qnn_compiler.py to support QNN compilation.
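
As a rough illustration of item 1, the sketch below assembles an infer command line that passes --enable_qnn with an optional config file. Only the --enable_qnn flag and the qnn_config.json file name come from this PR. The module path `QEfficient.cloud.infer`, the `--model_name` argument, the helper `build_infer_command`, and the placeholder JSON keys are assumptions made for illustration, not the confirmed interface or file format.

```python
import json
import sys
import tempfile
from pathlib import Path


def build_infer_command(model_name, qnn_config_path=None):
    """Return an argv list that enables QNN compilation.

    Only --enable_qnn and its optional config-file argument are taken from
    the PR description; the module path and --model_name are assumptions.
    """
    cmd = [
        sys.executable, "-m", "QEfficient.cloud.infer",  # assumed entry point
        "--model_name", model_name,                      # assumed argument
        "--enable_qnn",
    ]
    # The config file is optional, so append it only when one is supplied.
    if qnn_config_path is not None:
        cmd.append(str(qnn_config_path))
    return cmd


# Placeholder content: the real qnn_config.json keys are not shown in this PR.
placeholder_config = {"example_option": "example_value"}
config_path = Path(tempfile.mkdtemp()) / "qnn_config.json"
config_path.write_text(json.dumps(placeholder_config, indent=2))

command = build_infer_command("gpt2", config_path)
print(" ".join(command))
```

The command is only built and printed here, not executed, so the sketch runs without the library or the QNN SDK installed.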

ochougul commented 1 week ago

Please add instructions on how to use the QNN compiler to the README.