usefulsensors / qc_npu_benchmark

Code sample showing how to run and benchmark models on Qualcomm's Windows PCs
Apache License 2.0

Sharing Results / Logs #3

Open deathlyrage opened 1 month ago

deathlyrage commented 1 month ago

Snapdragon X Elite Dev Kit:

PS C:\qc_npu_benchmark> py benchmark_matmul.py
Error in cpuinfo: Unknown chip model name 'Snapdragon(R) X Elite - X1E001DE - Qualcomm(R) Oryon(TM) CPU'.
Please add new Windows on Arm SoC/chip support to arm/windows/init.c!
unknown Qualcomm CPU part 0x1 ignored
unknown Qualcomm CPU part 0x1 ignored
unknown Qualcomm CPU part 0x1 ignored
unknown Qualcomm CPU part 0x1 ignored
unknown Qualcomm CPU part 0x1 ignored
unknown Qualcomm CPU part 0x1 ignored
unknown Qualcomm CPU part 0x1 ignored
unknown Qualcomm CPU part 0x1 ignored
unknown Qualcomm CPU part 0x1 ignored
unknown Qualcomm CPU part 0x1 ignored
unknown Qualcomm CPU part 0x1 ignored
unknown Qualcomm CPU part 0x1 ignored
WARNING:root:Please consider to run pre-processing before quantization. Refer to example: https://github.com/microsoft/onnxruntime-inference-examples/blob/main/quantization/image_classification/cpu/ReadMe.md
WARNING:root:Please consider pre-processing before quantization. See https://github.com/microsoft/onnxruntime-inference-examples/blob/main/quantization/image_classification/cpu/ReadMe.md
Starting stage: Graph Preparation Initializing
Completed stage: Graph Preparation Initializing (430 us)
Starting stage: Graph Transformations and Optimizations
Completed stage: Graph Transformations and Optimizations (208903 us)
Starting stage: Graph Sequencing for Target
 [##################################################] 100%
Completed stage: Graph Sequencing for Target (430831 us)
Starting stage: VTCM Allocation
Completed stage: VTCM Allocation (904480 us)
Starting stage: Parallelization Optimization
Completed stage: Parallelization Optimization (93656 us)
Starting stage: Finalizing Graph Sequence
Completed stage: Finalizing Graph Sequence (293467 us)
Starting stage: Completion
Completed stage: Completion (921 us)
Starting stage: Graph Preparation Initializing
Completed stage: Graph Preparation Initializing (307 us)
Starting stage: Graph Transformations and Optimizations
Completed stage: Graph Transformations and Optimizations (203105 us)
Starting stage: Graph Sequencing for Target
 [##################################################] 100%
Completed stage: Graph Sequencing for Target (504907 us)
Starting stage: VTCM Allocation
Completed stage: VTCM Allocation (1105806 us)
Starting stage: Parallelization Optimization
Completed stage: Parallelization Optimization (120234 us)
Starting stage: Finalizing Graph Sequence
Completed stage: Finalizing Graph Sequence (174437 us)
Starting stage: Completion
Completed stage: Completion (1156 us)
************ Benchmark Results ************
NPU quantized compute, float I/O accuracy difference is 0.0100
NPU quantized compute and I/O accuracy difference is 0.0060
CPU took 8.24ms, 839,048,954,786 ops per second
NPU (quantized compute, float I/O) took 30.14ms, 229,337,363,916 ops per second
NPU (quantized compute and I/O) took 11.89ms, 581,509,287,932 ops per second
PS C:\qc_npu_benchmark>
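
For anyone comparing logs, the per-stage compile times can be totalled with a short script. This is only a sketch: `stage_totals` and `STAGE_RE` are hypothetical names that are not part of the benchmark, and the regex assumes the `Completed stage: NAME (N us)` line format seen in the log above. The sample text is the first compile pass from the Dev Kit run.

```python
import re
from collections import defaultdict

# Matches lines such as "Completed stage: VTCM Allocation (904480 us)".
# The format is assumed from the log above, not from any documented spec.
STAGE_RE = re.compile(r"^Completed stage: (?P<name>.+?) \((?P<us>\d+) us\)$")


def stage_totals(log_text):
    """Sum the reported microseconds per stage name across a log."""
    totals = defaultdict(int)
    for line in log_text.splitlines():
        match = STAGE_RE.match(line.strip())
        if match:
            totals[match.group("name")] += int(match.group("us"))
    return dict(totals)


# First compile pass copied from the Dev Kit log above.
sample = """\
Completed stage: Graph Preparation Initializing (430 us)
Completed stage: Graph Transformations and Optimizations (208903 us)
Completed stage: Graph Sequencing for Target (430831 us)
Completed stage: VTCM Allocation (904480 us)
Completed stage: Parallelization Optimization (93656 us)
Completed stage: Finalizing Graph Sequence (293467 us)
Completed stage: Completion (921 us)
"""

totals = stage_totals(sample)
total_us = sum(totals.values())
slowest = max(totals, key=totals.get)
print(f"total: {total_us / 1e6:.2f} s, slowest stage: {slowest}")
```

In this pass, VTCM Allocation is the largest single stage. Feeding a whole log to `stage_totals` sums both compile passes together, since the stage names repeat.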
deathlyrage commented 1 month ago

Lenovo Snapdragon Laptop (ThinkPad)


Error in cpuinfo: Unknown chip model name 'Snapdragon(R) X Elite - X1E78100 - Qualcomm(R) Oryon(TM) CPU'.
Please add new Windows on Arm SoC/chip support to arm/windows/init.c!
unknown Qualcomm CPU part 0x1 ignored
unknown Qualcomm CPU part 0x1 ignored
unknown Qualcomm CPU part 0x1 ignored
unknown Qualcomm CPU part 0x1 ignored
unknown Qualcomm CPU part 0x1 ignored
unknown Qualcomm CPU part 0x1 ignored
unknown Qualcomm CPU part 0x1 ignored
unknown Qualcomm CPU part 0x1 ignored
unknown Qualcomm CPU part 0x1 ignored
unknown Qualcomm CPU part 0x1 ignored
unknown Qualcomm CPU part 0x1 ignored
unknown Qualcomm CPU part 0x1 ignored
WARNING:root:Please consider to run pre-processing before quantization. Refer to example: https://github.com/microsoft/onnxruntime-inference-examples/blob/main/quantization/image_classification/cpu/ReadMe.md
WARNING:root:Please consider pre-processing before quantization. See https://github.com/microsoft/onnxruntime-inference-examples/blob/main/quantization/image_classification/cpu/ReadMe.md
Starting stage: Graph Preparation Initializing
Completed stage: Graph Preparation Initializing (360 us)
Starting stage: Graph Transformations and Optimizations
Completed stage: Graph Transformations and Optimizations (234225 us)
Starting stage: Graph Sequencing for Target
 [##################################################] 100%
Completed stage: Graph Sequencing for Target (470254 us)
Starting stage: VTCM Allocation
Completed stage: VTCM Allocation (1130989 us)
Starting stage: Parallelization Optimization
Completed stage: Parallelization Optimization (108373 us)
Starting stage: Finalizing Graph Sequence
Completed stage: Finalizing Graph Sequence (369914 us)
Starting stage: Completion
Completed stage: Completion (1308 us)
Starting stage: Graph Preparation Initializing
Completed stage: Graph Preparation Initializing (3549 us)
Starting stage: Graph Transformations and Optimizations
Completed stage: Graph Transformations and Optimizations (284238 us)
Starting stage: Graph Sequencing for Target
 [##################################################] 100%
Completed stage: Graph Sequencing for Target (502395 us)
Starting stage: VTCM Allocation
Completed stage: VTCM Allocation (872166 us)
Starting stage: Parallelization Optimization
Completed stage: Parallelization Optimization (70095 us)
Starting stage: Finalizing Graph Sequence
Completed stage: Finalizing Graph Sequence (110545 us)
Starting stage: Completion
Completed stage: Completion (1073 us)
************ Benchmark Results ************
NPU quantized compute, float I/O accuracy difference is 0.0100
NPU quantized compute and I/O accuracy difference is 0.0060
CPU took 9.60ms, 719,947,582,723 ops per second
NPU (quantized compute, float I/O) took 30.43ms, 227,119,478,686 ops per second
NPU (quantized compute and I/O) took 11.59ms, 596,241,063,859 ops per second
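
To make the two machines easier to compare, the result lines can be parsed and each NPU figure expressed relative to the CPU. This is a minimal sketch: `parse_results` and `RESULT_RE` are hypothetical helpers, not part of the benchmark script, and the regex assumes the `LABEL took X.XXms, N ops per second` format shown in the logs above. The numbers are copied from the two runs in this thread.

```python
import re

# Matches e.g. "CPU took 8.24ms, 839,048,954,786 ops per second".
# The line format is assumed from the logs in this thread.
RESULT_RE = re.compile(
    r"^(?P<label>.+?) took (?P<ms>[\d.]+)ms, (?P<ops>[\d,]+) ops per second"
)


def parse_results(log_text):
    """Return {label: (milliseconds, ops_per_second)} for each result line."""
    results = {}
    for line in log_text.splitlines():
        match = RESULT_RE.match(line.strip())
        if match:
            ops = int(match.group("ops").replace(",", ""))
            results[match.group("label")] = (float(match.group("ms")), ops)
    return results


dev_kit = parse_results("""\
CPU took 8.24ms, 839,048,954,786 ops per second
NPU (quantized compute, float I/O) took 30.14ms, 229,337,363,916 ops per second
NPU (quantized compute and I/O) took 11.89ms, 581,509,287,932 ops per second
""")

thinkpad = parse_results("""\
CPU took 9.60ms, 719,947,582,723 ops per second
NPU (quantized compute, float I/O) took 30.43ms, 227,119,478,686 ops per second
NPU (quantized compute and I/O) took 11.59ms, 596,241,063,859 ops per second
""")

for name, results in (("Dev Kit", dev_kit), ("ThinkPad", thinkpad)):
    cpu_ops = results["CPU"][1]
    for label, (ms, ops) in results.items():
        # Work per run implied by the two reported figures (time is rounded).
        implied_ops = ms / 1000 * ops
        print(f"{name:8} {label:36} {ops / cpu_ops:5.2f}x CPU, "
              f"~{implied_ops / 1e9:.2f} G ops per run")
```

On both machines the CPU path reports the highest throughput for this workload, and the NPU run with quantized I/O is roughly 2.5 times faster than the NPU run with float I/O.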