axinc-ai / ailia-models

The collection of pre-trained, state-of-the-art AI models for ailia SDK

Implement opset 17 version for e5 #1475

Closed kyakuno closed 5 months ago

kyakuno commented 6 months ago

opset11

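User (press q to exit): NNAPIとは何ですか。 [What is NNAPI?]
Text: NNAPIの概要NNAPIはAndroidでNPU (Neural Processing Unit)を使用するためのローレベルAPIです。 (Similarity:0.898) [NNAPI overview: NNAPI is a low-level API for using the NPU (Neural Processing Unit) on Android.]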

opset17

User (press q to exit): NNAPIとは何ですか。
Text: NNAPIの概要NNAPIはAndroidでNPU (Neural Processing Unit)を使用するためのローレベルAPIです。 (Similarity:0.899)

Always use the query tag

User (press q to exit): NNAPIとは何ですか。
Text: NNAPIの概要NNAPIはAndroidでNPU (Neural Processing Unit)を使用するためのローレベルAPIです。 (Similarity:0.879)
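
The three runs above differ in the exported opset and in which prefix tag is attached to the text. As a rough sketch only, assuming the usual e5 convention of a `query: ` prefix for questions and a `passage: ` prefix for documents, and cosine similarity as the score: the helper names `add_prefix` and `cosine_similarity` and the `always_query` flag are hypothetical, and the vectors are toy values, not model output.

```python
import math

def add_prefix(text, is_query, always_query=False):
    """Prepend an e5-style role prefix (hypothetical helper, not the repo's code)."""
    if is_query or always_query:
        return "query: " + text
    return "passage: " + text

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Default tagging: the question gets "query: ", the document gets "passage: ".
print(add_prefix("What is NNAPI?", is_query=True))
print(add_prefix("NNAPI is a low-level API.", is_query=False))

# "Always use the query tag": both sides get "query: ".
print(add_prefix("NNAPI is a low-level API.", is_query=False, always_query=True))

# Toy embeddings, for illustration only.
print(round(cosine_similarity([1.0, 0.0], [1.0, 1.0]), 3))
```

In the logs above, the same question and text score 0.898 (opset11) and 0.899 (opset17), and 0.879 when the query tag is always used.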
kyakuno commented 6 months ago

1179

kyakuno commented 6 months ago

opset11 batch_size=72 sequence_size=77

ailia SDK 1.3.0: BLAS 7810.2 ms, MPS 1437.8 ms

ailia SDK 1.4.0: BLAS 6833.0 ms, MPS 1012.8 ms
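
To put the timings above on a common scale, here is a small sketch that computes the speedup of ailia SDK 1.4.0 over 1.3.0 for each backend. The numbers are copied from this comment; the dictionary layout and the `speedup` helper are illustrative assumptions.

```python
# Timings in milliseconds, copied from the comment above
# (opset11, batch_size=72, sequence size 77).
timings_ms = {
    "BLAS": {"1.3.0": 7810.2, "1.4.0": 6833.0},
    "MPS": {"1.3.0": 1437.8, "1.4.0": 1012.8},
}

def speedup(old_ms, new_ms):
    """How many times faster the new measurement is than the old one."""
    return old_ms / new_ms

for backend, t in timings_ms.items():
    s = speedup(t["1.3.0"], t["1.4.0"])
    saved_pct = (1 - t["1.4.0"] / t["1.3.0"]) * 100
    print(f"{backend}: {s:.2f}x faster, {saved_pct:.1f}% less time")
```

This works out to roughly 1.14x on BLAS and roughly 1.42x on MPS.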

kyakuno commented 6 months ago

normal

====Profile(Grouped by LayerType)====
LayerType               TotalInferTime(Average)[us]  TimeRatio[%]
MatMul_DNN              587378                       50.34
Eltwise_DNN             243872                       20.90
UniversalGemm_DNN       139518                       11.96
ReduceMean_DNN          92958                        7.97
Transpose_DNN           63358                        5.43
Gelu_DNN                18501                        1.59
Softmax_DNN             12292                        1.05
ConvertValue_DNN        3986                         0.34
Eltwise                 2170                         0.19
Reshape_DNN             1431                         0.12
Gather                  1256                         0.11
CumSum                  63                           0.01
Unsqueeze               29                           0.00
ConvertValue            25                           0.00
====Profile(Summary)====
Predict Average Time[us]:1166835 Variance:1255310734 N:5

opt


====Profile(Grouped by LayerType)====
LayerType               TotalInferTime(Average)[us]  TimeRatio[%]
MatMul_DNN              574738                       59.64
UniversalGemm_DNN       141382                       14.67
Eltwise_DNN             110439                       11.46
LayerNormalization_DNN  54530                        5.66
Transpose_DNN           52991                        5.50
Gelu_DNN                13978                        1.45
Softmax_DNN             10811                        1.12
Eltwise                 2086                         0.22
Reshape_DNN             1311                         0.14
Gather                  1225                         0.13
CumSum                  62                           0.01
Unsqueeze               30                           0.00
ConvertValue            17                           0.00
====Profile(Summary)====
Predict Average Time[us]:963599 Variance:487987777 N:5
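
To see where the time moved between the two profiles, the sketch below parses a few rows from each table and prints the per-layer difference. The `parse_profile` helper is hypothetical, written only for the three-column layout shown above; the rows and the two summary totals are copied from the logs.

```python
def parse_profile(text):
    """Map LayerType -> average total time in microseconds."""
    times = {}
    for line in text.strip().splitlines():
        parts = line.split()
        # Data rows have three fields: name, time[us], ratio[%].
        # The header row also has three fields but a non-numeric second one.
        if len(parts) == 3 and parts[1].isdigit():
            times[parts[0]] = int(parts[1])
    return times

# Top five rows of each profile, copied from the logs above.
NORMAL = """
MatMul_DNN  587378  50.34
Eltwise_DNN 243872  20.90
UniversalGemm_DNN   139518  11.96
ReduceMean_DNN  92958   7.97
Transpose_DNN   63358   5.43
"""
OPT = """
MatMul_DNN  574738  59.64
UniversalGemm_DNN   141382  14.67
Eltwise_DNN 110439  11.46
LayerNormalization_DNN  54530   5.66
Transpose_DNN   52991   5.50
"""

normal, opt = parse_profile(NORMAL), parse_profile(OPT)
for name in sorted(set(normal) | set(opt)):
    # A layer type missing from one profile counts as 0 us there.
    delta = opt.get(name, 0) - normal.get(name, 0)
    print(f"{name:24s}{delta:+10d} us")

# Summary totals from the "Predict Average Time[us]" lines.
total_normal, total_opt = 1166835, 963599
print(f"overall: {(1 - total_opt / total_normal) * 100:.1f}% less time per predict")
```

In these numbers, ReduceMean_DNN disappears, LayerNormalization_DNN appears, and Eltwise_DNN drops by more than half. That is consistent with the layer-norm subgraph being replaced by the single LayerNormalization operator that ONNX added in opset 17, though the logs themselves do not state how the "opt" model was produced.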