SthPhoenix / InsightFace-REST

InsightFace REST API for easy deployment of face recognition services with TensorRT in Docker.
Apache License 2.0

InsightFace-REST

WARNING: The latest update may cause trouble with previously compiled Numba functions. If you encounter errors about 'modules not found', run the following command in the repo root to remove __pycache__:

find . | grep -E "(__pycache__|\.pyc$)" | sudo xargs rm -rf
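
Where `find` and `xargs` are not available, the same cleanup can be sketched in Python. This is a minimal equivalent, not part of the repository; the function name `remove_python_caches` is hypothetical. Unlike the command above it does not use `sudo`, so files created by a container running as root may still need elevated permissions.

```python
import shutil
from pathlib import Path


def remove_python_caches(root: str = ".") -> int:
    """Delete __pycache__ directories and stray .pyc files under root.

    Returns the number of filesystem entries that were targeted for removal.
    """
    root_path = Path(root)
    removed = 0
    # Materialise the matches first so the tree is not mutated while walking it.
    for cache_dir in list(root_path.rglob("__pycache__")):
        if cache_dir.is_dir():
            shutil.rmtree(cache_dir, ignore_errors=True)
            removed += 1
    # Pick up compiled files that live outside __pycache__ directories.
    for pyc_file in list(root_path.rglob("*.pyc")):
        if pyc_file.is_file():
            pyc_file.unlink()
            removed += 1
    return removed
```

Calling `remove_python_caches(".")` from the repo root targets the same paths as the shell pipeline.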

This repository aims to provide a convenient, easily deployable and scalable REST API for the InsightFace face detection and recognition pipeline, using FastAPI for serving and NVIDIA TensorRT for optimized inference.

The code is heavily based on the API code in the official DeepInsight InsightFace repository.

This repository provides source code for building a face recognition REST API and for converting models to ONNX and TensorRT using Docker.

Draw detections example

Key features:

Contents

List of supported models

Prerequisites

Running with Docker

API usage

Work in progress

Known issues

Changelog

List of supported models:

Detection:

| Model | Auto download | Batch inference | Detection (ms) | Inference (ms) | GPU-Util (%) | Source | ONNX File |
|---|---|---|---|---|---|---|---|
| retinaface_r50_v1 | Yes* | | 12.3 | 8.4 | 26 | official package | link |
| retinaface_mnet025_v1 | Yes* | | 8.6 | 4.6 | 17 | official package | link |
| retinaface_mnet025_v2 | Yes* | | 8.8 | 4.9 | 17 | official package | link |
| mnet_cov2 | Yes* | | 8.7 | 4.6 | 18 | mnet_cov2 | link |
| centerface | Yes | | 10.6 | 3.5 | 19 | Star-Clouds/CenterFace | link |
| scrfd_10g_bnkps | Yes* | Yes | 3.3 | 2 | 16 | SCRFD | link |
| scrfd_2.5g_bnkps | Yes* | Yes | 2.2 | 1.1 | 13 | SCRFD | link |
| scrfd_500m_bnkps | Yes* | Yes | 1.9 | 0.8 | 13 | SCRFD | link |
| scrfd_10g_gnkps | Yes* | Yes | 3.3 | 2.2 | 17 | SCRFD** | link |
| scrfd_2.5g_gnkps | Yes* | Yes | 2.3 | 1.2 | 14 | SCRFD** | link |
| scrfd_500m_gnkps | Yes* | Yes | 2.1 | 1.3 | 14 | SCRFD** | link |
| yolov5s-face | Yes* | Yes | | | | yolov5-face | link |
| yolov5m-face | Yes* | Yes | | | | yolov5-face | link |
| yolov5l-face | Yes* | Yes | | | | yolov5-face | link |

Note: Performance metrics were measured on an NVIDIA RTX 2080 SUPER + Intel Core i7-5820K (3.3 GHz, 6 cores) for api/src/test_images/lumia.jpg with force_fp16=True, det_batch_size=1 and max_size=640,640.

Detection time includes inference, pre- and postprocessing, but does not include image reading, decoding and resizing.

Note 2: SCRFD family models require input image dimensions divisible by 32, e.g. 640x640 or 1024x768.
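
When picking a max_size value for an SCRFD model, the divisibility rule can be checked with a small helper. The sketch below is illustrative only and not taken from the repository; the function names are hypothetical. It rounds each dimension up to the next multiple of 32.

```python
from typing import Tuple


def round_up_to_multiple(value: int, base: int = 32) -> int:
    """Round a positive integer up to the nearest multiple of base."""
    if value <= 0 or base <= 0:
        raise ValueError("value and base must be positive")
    return ((value + base - 1) // base) * base


def scrfd_compatible_size(width: int, height: int, base: int = 32) -> Tuple[int, int]:
    """Return (width, height) with both dimensions divisible by base."""
    return round_up_to_multiple(width, base), round_up_to_multiple(height, base)


# Sizes that already satisfy the rule are returned unchanged.
print(scrfd_compatible_size(640, 640))   # (640, 640)
# Other sizes are bumped up to the next valid value.
print(scrfd_compatible_size(1000, 750))  # (1024, 768)
```

Rounding up rather than down avoids shrinking the requested size; rounding down would be an equally valid choice if memory is the tighter constraint.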

Recognition:

| Model | Auto download | Batch inference | Inference b=1 (ms) | Inference b=64 (ms) | Source | ONNX File |
|---|---|---|---|---|---|---|
| arcface_r100_v1 | Yes* | Yes | 2.6 | 54.8 | official package | link |
| r100-arcface-msfdrop75 | No | Yes | - | - | SubCenter-ArcFace | None |
| r50-arcface-msfdrop75 | No | Yes | - | - | SubCenter-ArcFace | None |
| glint360k_r100FC_1.0 | No | Yes | - | - | Partial-FC | None |
| glint360k_r100FC_0.1 | No | Yes | - | - | Partial-FC | None |
| glintr100 | Yes* | Yes | 2.6 | 54.7 | official package | link |
| w600k_r50 | Yes* | Yes | 1.9 | 33.2 | official package | link |
| w600k_mbf | Yes* | Yes | 0.7 | 9.9 | official package | link |
| adaface_ir101_webface12m | Yes* | Yes | - | - | AdaFace repo | link |
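
The b=1 and b=64 columns are latencies per inference call, so they are easier to compare once converted to images per second. The sketch below does that conversion for two rows of the table. It uses only the figures listed above and ignores detection, pre- and postprocessing, so the results are rough upper bounds rather than end-to-end throughput; the function name is hypothetical.

```python
def images_per_second(batch_size: int, latency_ms: float) -> float:
    """Convert a per-call latency in milliseconds to images per second."""
    if batch_size <= 0 or latency_ms <= 0:
        raise ValueError("batch_size and latency_ms must be positive")
    return batch_size * 1000.0 / latency_ms


# (model, latency at b=1 in ms, latency at b=64 in ms), copied from the table.
rows = [
    ("arcface_r100_v1", 2.6, 54.8),
    ("w600k_mbf", 0.7, 9.9),
]

for model, latency_b1, latency_b64 in rows:
    single = images_per_second(1, latency_b1)
    batched = images_per_second(64, latency_b64)
    print(f"{model}: {single:.0f} img/s at b=1, {batched:.0f} img/s at b=64")
```

For arcface_r100_v1 this gives roughly 385 img/s at b=1 against roughly 1168 img/s at b=64, which is why batch inference matters for recognition-heavy workloads.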

Other:

| Model | Auto download | Inference code | Source | ONNX File |
|---|---|---|---|---|
| genderage_v1 | Yes* | Yes | official package | link |
| mask_detector | Yes* | Yes | Face-Mask-Detection | link |
| mask_detector112 | Yes* | Yes | Face-Mask-Detection*** | link |
| 2d106det | No | No | coordinateReg | None |

* - Models will be downloaded from Google Drive, which might be inaccessible in some regions like China.

** - custom models retrained for this repo. The original SCRFD models have a bug (deepinsight/insightface#1518) when detecting large faces occupying >40% of the image. These models are retrained with Group Normalization instead of Batch Normalization, which fixes the bug, though at the cost of some accuracy.

Model accuracy on the WiderFace benchmark:

| Model | Easy | Medium | Hard |
|---|---|---|---|
| scrfd_10g_gnkps | 95.51 | 94.12 | 82.14 |
| scrfd_2.5g_gnkps | 93.57 | 91.70 | 76.08 |
| scrfd_500m_gnkps | 88.70 | 86.11 | 63.57 |

*** - custom model retrained for 112x112 input size to remove excessive resize operations and improve performance.

Requirements:

  1. Docker
  2. Nvidia-container-toolkit
  3. Nvidia GPU drivers (470.x.x and above)

Running with Docker:

  1. Clone repo.
  2. Edit the settings in deploy_trt.sh if needed, then execute it from the repo's root.
  3. Go to http://localhost:18081 to access the documentation and try the API.
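
The container may need some time before the API responds, so a script that depends on it can poll the documentation URL first. The following is a minimal sketch using only the standard library; the helper names, retry counts and delays are assumptions, not values from the repository.

```python
import time
import urllib.error
import urllib.request


def is_service_up(url: str, timeout: float = 2.0) -> bool:
    """Return True if an HTTP GET to url succeeds with a 2xx/3xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


def wait_for_service(url: str, attempts: int = 30, delay_s: float = 2.0,
                     timeout: float = 2.0) -> bool:
    """Poll url until it responds or the attempts are exhausted."""
    for attempt in range(attempts):
        if is_service_up(url, timeout=timeout):
            return True
        if attempt < attempts - 1:
            time.sleep(delay_s)
    return False
```

For example, `wait_for_service("http://localhost:18081")` returns True once the page from step 3 is reachable, and False if it never comes up within the allotted attempts.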

If you have multiple GPUs with enough GPU memory, you can try running multiple containers by editing the n_gpu and n_workers parameters in deploy_trt.sh.

By default, the container is configured to build TRT engines without FP16 support. To enable it, change the value of force_fp16 to True in deploy_trt.sh. Keep in mind that your GPU should support fast FP16 inference (NVIDIA GPUs of the RTX 20xx series and above, or server GPUs such as the Tesla P100, T4, etc.).

Also, if you want to test the API in a non-GPU environment, you can run the service with the deploy_cpu.sh script. In this case ONNXRuntime will be used as the inference backend.

For a pure MXNet-based version without TensorRT support, you can check the deprecated v0.5.0 branch.

API usage:

For an example of API usage, please refer to the demo_client.py code.
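
As a rough illustration of what a client has to prepare, the sketch below builds a JSON request body with base64-encoded images. The endpoint path `/extract` and every field name in the payload are assumptions made for this example and are not confirmed by this README; check them against demo_client.py and the interactive documentation at http://localhost:18081 before use.

```python
import base64
import json
from typing import Iterable, Tuple

# Assumed endpoint path; verify it in the service documentation.
EXTRACT_ENDPOINT = "/extract"


def build_extract_request(image_bytes_list: Iterable[bytes],
                          max_size: Tuple[int, int] = (640, 640),
                          threshold: float = 0.6,
                          extract_embedding: bool = True) -> dict:
    """Build a request body with base64-encoded images.

    All keys below are hypothetical and must be checked against the real schema.
    """
    encoded = [base64.b64encode(raw).decode("ascii") for raw in image_bytes_list]
    return {
        "images": {"data": encoded},             # assumed field layout
        "max_size": list(max_size),              # assumed field name
        "threshold": threshold,                  # assumed field name
        "extract_embedding": extract_embedding,  # assumed field name
    }


# Demonstration with placeholder bytes instead of a real image file.
body = json.dumps(build_extract_request([b"abc"]))
print(body)
```

The resulting string can be sent with any HTTP client as a POST body with the `Content-Type: application/json` header.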

Work in progress:

Known issues:

Changelog:

2021-11-06 v0.7.0.0

Because many updates have accumulated since the last release, the version is bumped straight to v0.7.0.0.

Compared to the previous release (v0.6.2.0), this release brings improved performance for SCRFD-based detectors.

Here is a performance comparison on an NVIDIA RTX 2080 Super GPU for the scrfd_10g_gnkps detector paired with the glintr100 recognition model (all tests use src/api_trt/test_images/Stallone.jpg, 1 face per image):

| Num workers | Client threads | FPS v0.6.2.0 | FPS v0.7.0.0 | Speed-up |
|---|---|---|---|---|
| 1 | 1 | 56 | 103 | 83.9% |
| 1 | 30 | 72 | 128 | 77.7% |
| 6 | 30 | 145 | 179 | 23.4% |
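
The speed-up column can be read as the relative FPS gain over v0.6.2.0. The short sketch below recomputes it from the rounded FPS values in the table; the function name is hypothetical. The second row comes out as 77.8% rather than 77.7%, presumably because the published figure was derived from unrounded FPS measurements.

```python
def speedup_percent(fps_before: float, fps_after: float) -> float:
    """Relative throughput gain of fps_after over fps_before, in percent."""
    if fps_before <= 0:
        raise ValueError("fps_before must be positive")
    return (fps_after - fps_before) / fps_before * 100.0


# (num workers, client threads, FPS v0.6.2.0, FPS v0.7.0.0), copied from the table.
rows = [
    (1, 1, 56, 103),
    (1, 30, 72, 128),
    (6, 30, 145, 179),
]

for workers, threads, fps_old, fps_new in rows:
    gain = speedup_percent(fps_old, fps_new)
    print(f"workers={workers} threads={threads}: {gain:.1f}%")
```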

Additions:

Model Zoo:

Improvements:

Fixes:

2021-09-09 v0.6.2.0

REST-API

2021-08-07 v0.6.1.0

REST-API

2021-06-16 v0.6.0.0

REST-API

2021-05-08 v0.5.9.9

REST-API

2021-03-27 v0.5.9.8

REST-API

REST-API & conversion scripts:

2021-03-01 v0.5.9.7

REST-API & conversion scripts:

2021-03-01 v0.5.9.6

REST-API:

2021-02-13

REST-API:

2020-12-26

REST-API & conversion scripts:

2020-12-26

REST-API:

Conversion scripts:

2020-12-06

REST-API:

Conversion scripts:

2020-11-20

REST API:

Conversion scripts:

2020-11-07

Conversion scripts:

REST API:

2020-10-22

Conversion scripts:

REST API:

TensorRT version contains MXNet and ONNXRuntime compiled for CPU for testing and conversion purposes.

2020-10-16

Conversion scripts:

REST API:

2020-09-28