Prepare quantized tflite models available on GPU/DSP/TPU

mrsnu / band

Multi-DNN Inference Engine for Heterogeneous Mobile Processors

Prepare quantized tflite models available on GPU/DSP/TPU #12

Closed kdh0102 closed 3 years ago

kdh0102 commented 4 years ago

For more interesting experiments, we need models that can run on the DSP and GPU.

The DSP requires INT8 quantized models (the GPU accepts INT8 or FLOAT16), so we need quantized tflite models:

Currently available
- MobileNet V2
- Inception V2/V3
- YOLO, YOLO-tiny V3/V4

Need quantization
- Models in the mediapipe repo
- Object Detection (SSD)
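
For the models that still need quantization, a rough sketch of post-training full-integer (INT8) conversion with the TensorFlow Lite converter is shown below. It assumes the model is available as a SavedModel and that a few representative input samples (float32 arrays shaped like the model input, including the batch dimension) can be supplied for calibration. The function name `convert_to_int8` and its parameters are hypothetical, and this is not necessarily the procedure used for the models listed above.

```python
def convert_to_int8(saved_model_dir, representative_samples):
    """Sketch: convert a SavedModel to a fully INT8 quantized .tflite model.

    saved_model_dir: path to a TensorFlow SavedModel (assumed to exist).
    representative_samples: iterable of float32 arrays used for calibration
        (assumption: each one matches the model's single input, batch dim included).
    Returns the serialized .tflite model as bytes.
    """
    # Imported lazily so the module can be loaded without TensorFlow installed.
    import tensorflow as tf

    def representative_dataset():
        # The converter expects a generator yielding a list of input arrays.
        for sample in representative_samples:
            yield [sample]

    converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_dataset
    # Restrict to INT8 builtin ops so conversion fails loudly
    # if an op cannot be quantized, instead of silently staying in float.
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.int8
    converter.inference_output_type = tf.int8
    return converter.convert()
```

Restricting `supported_ops` to `TFLITE_BUILTINS_INT8` is a deliberate choice here: a model that converts under this setting has no float ops left, which is what an INT8-only target such as the DSP needs. Whether the delegate then accepts every op is a separate question.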

Note

~Succeeded in making INT8 quantized YOLOv4-tiny.tflite model.~ Quantization fails due to the minimum filter size in a conv. layer.

However:

- 3 SPLIT ops fall back to the CPU, which splits the graph into 4 GPU delegate ops.
- No DSP delegate op is generated (needs more investigation).

Even when quantization succeeds, the lack of op coverage can be a big obstacle.
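
To illustrate why 3 unsupported ops can yield 4 delegate ops, here is a minimal sketch that counts maximal runs of delegate-supported ops in a linear op sequence. The op list is hypothetical and is not the actual YOLOv4-tiny graph. Real TFLite graphs are DAGs and the delegate's partitioning is more involved, so treat this only as a simplified model of the effect.

```python
from itertools import groupby


def count_delegate_partitions(ops, unsupported):
    """Count maximal runs of consecutive ops that the delegate supports.

    ops: op names in execution order (simplifying assumption: a linear chain).
    unsupported: set of op names that fall back to the CPU.
    """
    supported_flags = (op not in unsupported for op in ops)
    # groupby merges consecutive equal flags; each True group is one partition.
    return sum(1 for is_supported, _ in groupby(supported_flags) if is_supported)


# Hypothetical op sequence with 3 SPLIT ops between supported ops.
ops = [
    "CONV_2D", "SPLIT",
    "CONV_2D", "CONCATENATION", "SPLIT",
    "CONV_2D", "SPLIT",
    "CONV_2D",
]
partitions = count_delegate_partitions(ops, unsupported={"SPLIT"})
print(partitions)  # prints 4
```

Each extra partition means another hand-off between the CPU and the delegate, which is why a handful of unsupported ops can matter even when most of the graph is covered.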

kdh0102 commented 3 years ago

Refer to our Google Drive for the available models. Maybe we can create a .tflite model zoo GitHub repo later.