dbasbabasi closed this issue 2 years ago
Hi! The Triton backend should work, but for now it's up to you to run a separate Triton Server container and provide its URL in the deploy_trt.sh config.
Also, there is currently a known issue with inference of the SCRFD model with the Triton backend: Triton provides outputs as non-writable numpy arrays, but the new optimized SCRFD post-processing modifies net output arrays in place to avoid excessive creation of numpy arrays. For now it can be fixed by replacing lines 332-334 of scrfd.py
with:
score_blob = np.copy(net_outs[idx][0])
bbox_blob = np.copy(net_outs[idx + self.fmc][0])
kpss_blob = np.copy(net_outs[idx + self.fmc * 2][0])
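The root cause can be reproduced outside of Triton: numpy refuses in-place operations on arrays whose writeable flag is cleared, while a copy is always writable. A minimal sketch (the array shape and values here are made up for illustration):

```python
import numpy as np

# Simulate a Triton output: a numpy array flagged as non-writable.
net_out = np.arange(8, dtype=np.float32)
net_out.setflags(write=False)

# In-place modification, as the optimized SCRFD post-processing does,
# raises ValueError on a read-only array.
try:
    net_out *= 2.0
    in_place_ok = True
except ValueError:
    in_place_ok = False

# np.copy() produces an independent, writable array, so the same
# operation succeeds on the copy while the original stays untouched.
score_blob = np.copy(net_out)
score_blob *= 2.0

print(in_place_ok)    # False
print(score_blob[1])  # 2.0
```

This is why wrapping the three blobs in np.copy() is enough to make the post-processing work against Triton outputs.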
Thank you for your quick response.
Actually I have some experience with Triton, but there is a problem with getting the metadata while loading the model, and the docker container stops automatically. I tried to debug it, but I couldn't fix it. I used the following models and config:
max_size=640,640 det_model=retinaface_r50_v1 rec_model=arcface_r100_v1
Docker logs:
Have you changed localhost to the actual Triton server IP and gRPC port? Inside docker, localhost is the container itself, not the host machine.
Here is my deploy_trt file. Yeah, I tried with the host IP and also with localhost, and opened the port in the docker run command.
You shouldn't bind Triton ports inside the insightface-rest container; it would cause exceptions when starting the Triton server or the IFR container.
Yeah, I got it. I deleted the ports, ran the inference docker, and after that ran deploy_trt. It looks like the detection model uploaded, and I can see the model output list during the load, but I got another error for ArcFace. I am checking it. Thank you so much for your help.
IFR uses shared GPU memory to communicate with the Triton server, so it may not work if Triton is on a different host.
Yeah, it works on the same machine. I could send a face detection request to Triton, but when I tried to load the face recognition model, it returned a CUDA shared memory error.
Also, I needed to change the face detection request dimensions to fix it.
I have just checked - everything seems to be working using the fix from https://github.com/SthPhoenix/InsightFace-REST/issues/60#issuecomment-972911459 I have followed these steps:

1. Ran deploy_trt.sh setting rec_batch_size = 32 and det_batch_size = 10
2. Copied the built engines to {triton_models}/scrfd_10g_gnkps/1/model.plan and {triton_models}/glintr100/1/model.plan
3. Edited deploy_trt.sh, changing det_batch_size to 1 and INFERENCE_BACKEND to triton, and providing a valid triton_uri (your host machine's local IP address)
4. Ran deploy_trt.sh again.

Though you should provide valid model configs to get use of dynamic batching.
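For reference, dynamic batching is enabled per model through a config.pbtxt in the Triton model repository. A minimal hedged sketch (the tensor names, dims, and batch sizes below are illustrative placeholders, not values taken from this repo):

```protobuf
name: "glintr100"
platform: "tensorrt_plan"
max_batch_size: 32
input [
  {
    name: "input.1"        # illustrative input tensor name
    data_type: TYPE_FP32
    dims: [ 3, 112, 112 ]
  }
]
output [
  {
    name: "embedding"      # illustrative output tensor name
    data_type: TYPE_FP32
    dims: [ 512 ]
  }
]
dynamic_batching {
  preferred_batch_size: [ 8, 16, 32 ]
  max_queue_delay_microseconds: 100
}
```

Without a dynamic_batching block Triton serves requests one by one, so the larger batch sizes baked into the TRT engine go unused.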
Also keep in mind that creating shared memory regions actually uses additional GPU memory (about 110-150 MB per worker), so ensure you have enough free GPU RAM.
Thank you so much. I used an ONNX model for Triton; it works right now for RetinaFace and ArcFace. Do you have plans to add age/gender for Triton?
The gender/age model is temporarily not supported, since the g/a model requires different face crop preprocessing than the current glintr100 recognition models.
I used the RetinaFace ResNet model for face detection. I will try to run the g/a model. Thank you so much for your help. If you have a recommendation for g/a I would be really glad; otherwise I will close this issue.
You could implement it, but you'll have to make copies of the face crop numpy arrays at the recognition step, otherwise g/a estimations will be totally wrong due to the different preprocessing required for recognition and g/a estimation. Copying the numpy arrays will hit overall performance, though I haven't tested by how much yet.
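The hazard can be illustrated with a small sketch (the function name and normalization constants are hypothetical, not the repo's actual pipeline): if the recognition step normalizes the crop in place, a g/a model fed the same array would see already-normalized pixels, so the crop must be copied before recognition preprocessing runs.

```python
import numpy as np

def rec_preprocess(crop):
    # Hypothetical ArcFace-style normalization, applied IN PLACE,
    # mutating the array that was passed in.
    crop -= 127.5
    crop *= 1.0 / 128.0
    return crop

# A dummy 112x112 BGR face crop filled with a constant pixel value.
crop = np.full((112, 112, 3), 200.0, dtype=np.float32)

ga_input = np.copy(crop)          # copy BEFORE recognition preprocessing
rec_input = rec_preprocess(crop)  # mutates `crop` in place

print(ga_input[0, 0, 0])   # still 200.0 - safe to feed the g/a model
print(rec_input[0, 0, 0])  # (200 - 127.5) / 128, normalized for recognition
```

Skipping the np.copy() would hand the g/a model the normalized values, which is why estimations come out totally wrong without it.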
Thank you. I used my own model for that, as ONNX, and wrote a new client for these models. The results look good. Your repo is awesome. Thank you so much for your help!
Nice to hear that! Have you used a publicly available model for g/a or have you trained your own?
I used my own trained models. I converted them to ONNX and wrote a new client for age/gender, emotion, and mask detection. After the face crop, I passed the cropped face to inference. I see RetinaFace had a pretrained mask model, but it looks unavailable right now.
Sorry for late reply, finally got some free time )
You have separate models for GA, emotion, and mask detection working on 112x112 face crops? That's interesting, since all the pretrained models for these tasks I have seen were expecting a different input shape. Could you point out where I could find the training code or models if you have used public repos?
Hey, yeah, the GA model is separate. It's not a public repo, so I can't share it. Our models work with RetinaFace. I have no idea about public GA and mask models.
Hi, it works on the TRT backend. I am trying to run it on the Triton backend. I changed the docker parameter in the deploy_trt file. It fails on warmup with the Triton backend. Do I need to change another config?