SKKU-ESLAB / Auto-Compression

Automatic DNN compression tool with various model compression and neural architecture search techniques
MIT License

[Notion] XNNPACK sparse inference #54

Closed alpakaMK2 closed 1 year ago

alpakaMK2 commented 1 year ago

XNNPACK provides sparse inference using sparse kernels on multiple platforms, but its sparse inference support is limited in two ways:

  1. Your TFLite model must follow XNNPACK's sub-graph rules to use sparse inference. Notably, MobileNet v1 as provided by Keras (i.e. `tensorflow.keras.applications.MobileNet`) does not follow these rules. To run MobileNet v1 sparsely, replace the last convolution with a fully connected (FC) layer and use an average pooling that does not keep the spatial dimensions.
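The sub-graph rule described above can be sketched as a small structural check. This is a hypothetical, simplified checker (the function name, the `ops` representation, and the attribute dictionary are illustrative, not part of XNNPACK's API); the op names `MEAN`, `CONV_2D`, and `FULLY_CONNECTED` and the `keep_dims` attribute follow TFLite's operator naming:

```python
# Hypothetical checker for the classifier-head pattern the issue describes:
# XNNPACK's sparse path expects a mean/average pool that squeezes the
# spatial dimensions followed by a FULLY_CONNECTED classifier, whereas the
# Keras MobileNet v1 export ends in a keepdims avgpool plus a 1x1 CONV_2D.
def head_is_sparse_friendly(ops):
    """ops: list of (op_name, attrs_dict) tuples for the tail of the graph."""
    if len(ops) < 2:
        return False
    (pool_op, pool_attrs), (cls_op, _) = ops[-2], ops[-1]
    return (pool_op == "MEAN"
            and not pool_attrs.get("keep_dims", False)
            and cls_op == "FULLY_CONNECTED")

# Keras-default head: keepdims avgpool + 1x1 conv classifier -> not eligible.
keras_head = [("MEAN", {"keep_dims": True}), ("CONV_2D", {})]

# Rewritten head per the issue: squeeze the pool, classify with an FC layer.
fixed_head = [("MEAN", {"keep_dims": False}), ("FULLY_CONNECTED", {})]
```

For example, `head_is_sparse_friendly(keras_head)` is false while `head_is_sparse_friendly(fixed_head)` is true, which mirrors why the stock Keras MobileNet v1 must be modified before XNNPACK will run it sparsely.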

  2. Model sparsity (the fraction of zero weights among all prunable-layer weights) needs to exceed 66%. To run a sparse model at an arbitrary sparsity level, you would have to modify XNNPACK's sub-graph code and the TensorFlow backend compiler.
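The sparsity criterion in point 2 can be sketched as follows. This is a minimal illustration (the function names and the list-of-lists weight representation are assumptions, and the 66% threshold is the figure stated in this issue, not a constant taken from XNNPACK's source):

```python
def model_sparsity(layer_weights):
    """Fraction of zero-valued weights across all prunable layers.

    layer_weights: list of flat weight lists, one per prunable layer.
    """
    total = sum(len(w) for w in layer_weights)
    zeros = sum(1 for w in layer_weights for v in w if v == 0.0)
    return zeros / total

# Threshold quoted in this issue: XNNPACK's sparse kernels are only
# usable once overall sparsity exceeds roughly 66%.
XNNPACK_SPARSITY_THRESHOLD = 0.66

def can_use_sparse_inference(layer_weights):
    return model_sparsity(layer_weights) > XNNPACK_SPARSITY_THRESHOLD
```

For instance, a model whose prunable layers are 75% zeros passes the check, while one at 25% zeros does not and would fall back to dense kernels.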