GreenWaves-Technologies / gap_sdk

SDK for Greenwaves Technologies' GAP8 IoT Application Processor
https://greenwaves-technologies.com/en/gap8-the-internet-of-things-iot-application-processor/
Apache License 2.0

Depth To Space Support and Transformer Softmax Speedup #415

Closed: Thomacdebabo closed this issue 5 months ago

Thomacdebabo commented 5 months ago

Hi, I am currently deploying a multitask vision model on gap9. However, I have two main issues:

  1. My model uses an operation called DepthToSpace, which is not yet supported by the SDK. I believe it would be great to have, since it allows for fast and efficient upsampling.
  2. I am deploying an efficient attention module. It uses Softmax on a rather large tensor, which causes a massive slowdown (46% of the total operations of my network are used to calculate the softmax). This can be sped up significantly by simply using a LUT for the exponential calculations, according to this paper. Given the high interest in transformers lately, I think this is a problem worth solving :).
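
To illustrate the second point, here is a minimal NumPy sketch of a lookup-table softmax. It is only an assumption about how such a scheme could look, not the method from the paper and not an existing SDK kernel: the names `build_exp_lut` and `lut_softmax`, the int8 input format, and the quantization `scale` are all hypothetical. The idea is that after subtracting the row maximum, an int8 input can only produce 256 distinct exponent arguments, so every `exp` call becomes a table read.

```python
import numpy as np

def build_exp_lut(scale, size=256):
    """Precompute exp(-i * scale) for i = 0..size-1 (hypothetical helper)."""
    return np.exp(-scale * np.arange(size, dtype=np.float32)).astype(np.float32)

def lut_softmax(q, lut, axis=-1):
    """Softmax over int8 logits q (real value = q * scale) via table lookup."""
    q = q.astype(np.int32)
    # Distance to the maximum along the axis: always in [0, 255] for int8 input.
    idx = q.max(axis=axis, keepdims=True) - q
    e = lut[idx]  # one table read per element instead of one exp() call
    return e / e.sum(axis=axis, keepdims=True)

# Compare against a direct floating-point softmax on random int8 logits.
scale = 0.05  # assumed quantization scale
rng = np.random.default_rng(0)
q = rng.integers(-128, 128, size=(4, 64)).astype(np.int8)
approx = lut_softmax(q, build_exp_lut(scale))

x = q.astype(np.float32) * scale
exact = np.exp(x - x.max(axis=-1, keepdims=True))
exact /= exact.sum(axis=-1, keepdims=True)
print(np.abs(approx - exact).max())
```

The table holds only 256 entries, so it could plausibly sit in fast local memory. A fixed-point version would also need to replace the final division, which this sketch leaves in floating point.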

My model is written in PyTorch and exported to ONNX with opset version 16.
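
For reference, the ONNX DepthToSpace operator can be written as a reshape, a transpose, and a second reshape. The sketch below follows the two modes defined in the ONNX operator specification (`DCR` and `CRD`) for NCHW tensors. The function name `depth_to_space` is just an illustration and not an SDK API, and which mode a given export uses should be checked in the exported graph.

```python
import numpy as np

def depth_to_space(x, blocksize, mode="DCR"):
    """Rearrange (B, C, H, W) into (B, C / bs^2, H * bs, W * bs)."""
    b, c, h, w = x.shape
    if c % (blocksize * blocksize) != 0:
        raise ValueError("channel count must be divisible by blocksize**2")
    c_out = c // (blocksize * blocksize)
    if mode == "DCR":
        # Channel index is laid out as (block_row, block_col, out_channel).
        tmp = x.reshape(b, blocksize, blocksize, c_out, h, w)
        tmp = tmp.transpose(0, 3, 4, 1, 5, 2)
    elif mode == "CRD":
        # Channel index is laid out as (out_channel, block_row, block_col).
        tmp = x.reshape(b, c_out, blocksize, blocksize, h, w)
        tmp = tmp.transpose(0, 1, 4, 2, 5, 3)
    else:
        raise ValueError("mode must be 'DCR' or 'CRD'")
    # After the transpose, the axes are (b, c_out, h, bs, w, bs).
    return tmp.reshape(b, c_out, h * blocksize, w * blocksize)

# Eight channels at a single pixel become two channels of 2x2 pixels.
x = np.arange(8).reshape(1, 8, 1, 1)
print(depth_to_space(x, 2, mode="DCR")[0])
print(depth_to_space(x, 2, mode="CRD")[0])
```

Since the operation only moves data, a native kernel would mostly be a strided copy.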

Here are examples of the upscaling and attention layers: onnx_files.zip. (Unfortunately, the ONNX export of the attention head is quite messy; it is based on Segformer, adapted so that it runs on gap9.)

Hope this issue sparks some interest, and I am happy to provide more information if needed.

Thomacdebabo commented 5 months ago

This issue is related to gap9, so I moved it here.