SthPhoenix / InsightFace-REST

InsightFace REST API for easy deployment of face recognition services with TensorRT in Docker.
Apache License 2.0

KeyError: 'score_8' in trt_backend.py (called by scrfd.py) when running demo scripts with scrfd_10g_gnkps and glintr100 #131

Closed: ZayneHuang closed this issue 6 months ago

ZayneHuang commented 6 months ago

Hi @SthPhoenix ,

Thanks for the wonderful work!

I am running the demo_client.py script as described in the repository, with the default models scrfd_10g_gnkps and glintr100. The ONNX file downloaded from the link on the README homepage failed the md5 check, so I manually downloaded the weight files listed in the config file, and those passed the md5 check.

However, I got the following error:
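For readers who want to repeat the md5 check by hand, here is a minimal sketch using only Python's standard library. The helper names file_md5 and md5_matches are made up for illustration and are not part of InsightFace-REST; the expected hash has to be taken from the repository's own model config.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def md5_matches(path, expected_md5):
    """Compare a file's MD5 with the expected value, ignoring case and whitespace."""
    return file_md5(path) == expected_md5.strip().lower()
```

Calling md5_matches with the path to the downloaded ONNX file and the hash from the config returns False for a corrupted or mismatched download, which is the situation described above.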

[2024-05-13 17:40:10 +0000] [96] [ERROR] Exception in ASGI application
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/uvicorn/protocols/http/httptools_impl.py", line 411, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/usr/local/lib/python3.10/dist-packages/uvicorn/middleware/proxy_headers.py", line 69, in __call__
    return await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/fastapi/applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/applications.py", line 123, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/errors.py", line 186, in __call__
    raise exc
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/errors.py", line 164, in __call__
    await self.app(scope, receive, _send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/exceptions.py", line 65, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/usr/local/lib/python3.10/dist-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 756, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 776, in app
    await route.handle(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 297, in handle
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 77, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/usr/local/lib/python3.10/dist-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 72, in app
    response = await func(request)
  File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 278, in app
    raw_response = await run_endpoint_function(
  File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 191, in run_endpoint_function
    return await dependant.call(**values)
  File "/app/app.py", line 92, in extract
    output = await processing.extract(images, max_size=data.max_size, return_face_data=data.return_face_data,
  File "/app/modules/processing.py", line 92, in extract
    output = await self.model.embed(images, max_size=max_size, return_face_data=return_face_data,
  File "/app/modules/face_model.py", line 411, in embed
    faces_by_img = (e for e in await _get([img for img in imgs_iterable]))
  File "/app/modules/face_model.py", line 276, in get
    det_predictions = zip(*_partial_detect(batch_imgs))
  File "/app/modules/face_model.py", line 63, in detect
    bboxes, landmarks = self.retina.detect(data, threshold=threshold)
  File "/app/modules/model_zoo/detectors/scrfd.py", line 214, in detect
    net_outs = self._forward(blob)
  File "/app/modules/model_zoo/detectors/scrfd.py", line 302, in _forward
    net_outs = self.session.run(
  File "/app/modules/model_zoo/exec_backends/trt_backend.py", line 199, in run
    net_out = [net_out[e] for e in self.output_order]
  File "/app/modules/model_zoo/exec_backends/trt_backend.py", line 199, in <listcomp>
    net_out = [net_out[e] for e in self.output_order]
KeyError: 'score_8'

I logged the variables net_out and self.output_order at line 197 of trt_backend.py, and the result was:

[17:40:10] INFO - self.output_order: ['score_8', 'score_16', 'score_32', 'bbox_8', 'bbox_16', 'bbox_32', 'kps_8', 'kps_16', 'kps_32']
[17:40:10] INFO - net_out: {'451': array([[0.01507142]], dtype=float32), '454': array([[0.75878906, 0.94677734, 0.79248047, 1.1074219 ]], dtype=float32), '457': array([[-0.19628906, -0.27490234,  0.3779297 , -0.21911621,  0.05874634,
         0.18481445, -0.26293945,  0.46972656,  0.23632812,  0.51708984]],
      dtype=float32), '504': array([[0.01798534]], dtype=float32), '507': array([[1.6796875, 2.0957031, 1.5351562, 1.9804688]], dtype=float32), '510': array([[-0.66308594, -0.6245117 ,  0.58935547, -0.63427734, -0.05108643,
         0.14575195, -0.5678711 ,  0.80371094,  0.41552734,  0.8076172 ]],
      dtype=float32), '557': array([[0.01923281]], dtype=float32), '560': array([[2.3066406, 3.0117188, 2.1425781, 2.7910156]], dtype=float32), '563': array([[-0.8310547 , -0.68603516,  0.7270508 , -0.76708984,  0.12463379,
         0.26660156, -0.66015625,  1.1806641 ,  0.68847656,  1.1005859 ]],
      dtype=float32)}

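My guess is that the outputs of different detection model versions conflict (the traceback shows the call coming from scrfd.py): the engine returns outputs keyed by numeric names such as '451', while the backend looks them up by names such as 'score_8'. I am not sure of the exact cause or how to fix it properly.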
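The mismatch in the log above can be made explicit with a small check. This is only a diagnostic sketch: check_output_names is a hypothetical helper, not a function in trt_backend.py, and the two inputs are copied from the logged values (the array contents are replaced by None because only the keys matter here).

```python
def check_output_names(expected_order, net_out):
    """Return (missing, unexpected) output names for a backend result dict."""
    missing = [name for name in expected_order if name not in net_out]
    unexpected = [name for name in net_out if name not in expected_order]
    return missing, unexpected


# Names the backend expects, as logged from self.output_order.
expected = ['score_8', 'score_16', 'score_32',
            'bbox_8', 'bbox_16', 'bbox_32',
            'kps_8', 'kps_16', 'kps_32']

# Keys actually returned by the engine, as logged from net_out.
actual = {key: None for key in
          ['451', '454', '457', '504', '507', '510', '557', '560', '563']}

missing, unexpected = check_output_names(expected, actual)
if missing:
    print("Expected outputs not produced by the engine:", missing)
    print("Engine outputs with unrecognized names:", unexpected)
```

With the logged values, every expected name is missing and every returned key is unrecognized, which suggests the engine was built from a model whose outputs were named differently from what the detector code expects.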

I hope you can offer some advice on this error. Thanks!

SthPhoenix commented 6 months ago

Hi! That's weird. I have just double-checked this, and the model seems to load as expected on a clean image build. Did you make any modifications to the code or config before running?

ZayneHuang commented 6 months ago

Thanks for your help. The output has returned to normal after I cleared the previously generated TRT engines. I suspect that it was due to an incorrect ONNX model that resulted in an erroneous TRT engine and incorrect output.

Since the issue has been resolved, I have closed it.
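For anyone hitting the same KeyError, the fix described above (clearing previously generated TRT engines so they are rebuilt from the correct ONNX file) could be sketched as follows. The directory to scan and the file suffixes ".plan" and ".engine" are assumptions, not confirmed by this thread; check where your deployment actually stores its engines before deleting anything. The function defaults to a dry run that only lists candidates.

```python
from pathlib import Path


def find_engines(root, suffixes=(".plan", ".engine")):
    """List files under root whose suffix looks like a serialized TRT engine."""
    root = Path(root)
    return sorted(p for p in root.rglob("*")
                  if p.is_file() and p.suffix in suffixes)


def clear_engines(root, suffixes=(".plan", ".engine"), dry_run=True):
    """Delete cached engine files under root; with dry_run=True, only report them."""
    selected = find_engines(root, suffixes)
    if not dry_run:
        for path in selected:
            path.unlink()
    return selected
```

Running clear_engines on the model directory first with the default dry_run=True shows what would be removed; repeating the call with dry_run=False deletes those files, after which the service has to rebuild the engines on its next start.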