zhouyuchong / face-recognition-deepstream

A DeepStream app that uses RetinaFace and ArcFace for face recognition.
MIT License

Which DeepStream and TensorRT versions did you use? #4

Closed · qustions closed this issue 2 years ago

qustions commented 2 years ago

Hello @zhouyuchong, I am trying to run the app and getting this error:

ERROR: [TRT]: 4: [runtime.cpp::deserializeCudaEngine::75] Error Code 4: Internal Error (Engine deserialization failed.)
ERROR: ../nvdsinfer/nvdsinfer_model_builder.cpp:1528 Deserialize engine failed from file: /opt/nvidia/deepstream/deepstream-6.0/sources/apps/deepstream_python_apps/models/arcface/arcface-r100.engine

My configuration: Docker DeepStream 6.0, TensorRT 8.0. I even tried to generate the engine file as mentioned in the README (this). I have successfully generated the .wts file, but when I run make it gives me this error:

/tensorrtx/arcface/prelu.h(31): error: member function declared with "override" does not override a base class member
/tensorrtx/arcface/prelu.h(64): error: exception specification for virtual function "nvinfer1::PReluPlugin::detachFromContext" is incompatible with that of overridden function "nvinfer1::IPluginV2Ext::detachFromContext"

Any idea how to solve this, or how to generate the weight file for my configuration?

zhouyuchong commented 2 years ago

Hi there. Yes, I've met the same problem before. It seems to be caused by the low TensorRT version targeted by the tensorrtx repo. I added some code to solve this problem; you can find it here. By the way, if there are still errors, try comparing against the yolov5 sub-folder in the tensorrtx repo, since yolov5 is the most used and is well maintained by the owner.

qustions commented 2 years ago

Thanks @zhouyuchong, that issue has been solved. I have a few questions:

  1. How to generate libArcFaceDecoder.so, libYoloV5Decoder.so, and libRetinafaceDecoder.so? I believe there should be some C code for that?
  2. How to print the extracted 512 features (this var)? I tried to add a print on line 162, but figured out that this while loop is not getting triggered, yet it is printing:
    gstname= video/x-raw
    features= <Gst.CapsFeatures object at 0x7f48f9ab22e0 (GstCapsFeatures at 0x7f4684067b40)>
    ========================== FRAME 244 ============================
    face 1 of person-0 detected
  3. Did you try to run this in the deepstream-test3 C app (not Python)? When I tried to run it, it gave this error:
    
    ERROR: [TRT]: 3: getPluginCreator could not find plugin: Decode_TRT version: 1
    ERROR: [TRT]: 1: [pluginV2Runner.cpp::load::291] Error Code 1: Serialization (Serialization assertion creator failed.Cannot deserialize plugin since corresponding IPluginCreator not found in Plugin Registry)
    ERROR: [TRT]: 4: [runtime.cpp::deserializeCudaEngine::75] Error Code 4: Internal Error (Engine deserialization failed.)
    ERROR: Deserialize engine failed from file: /opt/nvidia/deepstream/deepstream-6.0/sources/apps/deepstream_python_apps/models/retinaface/retina_r50.engine
    0:00:00.894043339 62690 0x556844c66070 WARN                 nvinfer gstnvinfer.cpp:635:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 2]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1888> [UID = 2]: deserialize engine from file :/opt/nvidia/deepstream/deepstream-6.0/sources/apps/deepstream_python_apps/models/retinaface/retina_r50.engine failed
    0:00:00.894084022 62690 0x556844c66070 WARN                 nvinfer gstnvinfer.cpp:635:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 2]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:1993> [UID = 2]: deserialize backend context from engine from file :/opt/nvidia/deepstream/deepstream-6.0/sources/apps/deepstream_python_apps/models/retinaface/retina_r50.engine failed, try rebuild
    0:00:00.894092376 62690 0x556844c66070 INFO                 nvinfer gstnvinfer.cpp:638:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 2]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1914> [UID = 2]: Trying to create engine from model files
    ERROR: failed to build network since there is no model file matched.
    ERROR: failed to build network.
    0:00:00.894407489 62690 0x556844c66070 ERROR                nvinfer gstnvinfer.cpp:632:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 2]: Error in NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1934> [UID = 2]: build engine file failed
    0:00:00.894425041 62690 0x556844c66070 ERROR                nvinfer gstnvinfer.cpp:632:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 2]: Error in NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2020> [UID = 2]: build backend context failed
    0:00:00.894431913 62690 0x556844c66070 ERROR                nvinfer gstnvinfer.cpp:632:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 2]: Error in NvDsInferContextImpl::initialize() <nvdsinfer_context_impl.cpp:1257> [UID = 2]: generate backend failed, check config file settings
    0:00:00.894606434 62690 0x556844c66070 WARN                 nvinfer gstnvinfer.cpp:841:gst_nvinfer_start:<primary-nvinference-engine> error: Failed to create NvDsInferContext instance
    0:00:00.894613261 62690 0x556844c66070 WARN                 nvinfer gstnvinfer.cpp:841:gst_nvinfer_start:<primary-nvinference-engine> error: Config file path: config_retinaface.txt, NvDsInfer Error: NVDSINFER_CONFIG_FAILED
    Running...
    ERROR from element primary-nvinference-engine: Failed to create NvDsInferContext instance
    Error details: gstnvinfer.cpp(841): gst_nvinfer_start (): /GstPipeline:dstest3-pipeline/GstNvInfer:primary-nvinference-engine:
    Config file path: config_retinaface.txt, NvDsInfer Error: NVDSINFER_CONFIG_FAILED
    Returned, stopping playback
    Deleting pipeline
zhouyuchong commented 2 years ago

@qustions For Q1, yes, there is some C code for the shared libs in the tensorrtx repo. It is used to decode the output layer of the original TRT engine. Follow the official tensorrtx instructions to generate them; for example, for yolov5, see CMakeLists line 28. For Q2, I tried to run the code on my PC and it works fine:

gstname= video/x-raw
features= <Gst.CapsFeatures object at 0x7f05ff25ba00 (GstCapsFeatures at 0x7f045c00a840)>
In cb_newpad

gstname= audio/x-raw
========================== FRAME 144 ============================
face 48 of person-6 detected
get facial features of person 6
========================== FRAME 181 ============================
face 52 of person-11 detected
get facial features of person 11
========================== FRAME 224 ============================
face 62 of person-32 detected
get facial features of person 32
========================== FRAME 229 ============================
face 64 of person-55 detected
get facial features of person 55
========================== FRAME 236 ============================
face 66 of person-60 detected
get facial features of person 60
========================== FRAME 240 ============================
face 68 of person-70 detected

and I can also get the 512d output data successfully. Maybe you could check your code and let me know the details.
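In case it helps, here is roughly how the 512-d tensor can be read in a pyds pad probe. This is only a sketch: it assumes output-tensor-meta=1 is set for the arcface gie and that the embedding is output layer 0, neither of which is taken from this repo's configs.

import ctypes
import numpy as np
import pyds

def extract_embedding(obj_meta):
    # Walk the object's user meta list and return the 512-d ArcFace
    # embedding as a numpy array, or None if no tensor meta is attached.
    l_user = obj_meta.obj_user_meta_list
    while l_user is not None:
        user_meta = pyds.NvDsUserMeta.cast(l_user.data)
        if user_meta.base_meta.meta_type == pyds.NvDsMetaType.NVDSINFER_TENSOR_OUTPUT_META:
            tensor_meta = pyds.NvDsInferTensorMeta.cast(user_meta.user_meta_data)
            layer = pyds.get_nvds_LayerInfo(tensor_meta, 0)  # assumes layer 0 is the embedding
            ptr = ctypes.cast(pyds.get_ptr(layer.buffer), ctypes.POINTER(ctypes.c_float))
            return np.ctypeslib.as_array(ptr, shape=(512,)).copy()
        l_user = l_user.next
    return None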

By the way, the main branch is not accurate; the process-mode-2 version is now fixed. I will update it when I find some spare time.

qustions commented 2 years ago

The problem was that compare_coordinates(obj_meta, PERSON_DETECTED[key][0]) was returning False for some reason, so I commented out the check and forced it to True to run inference on every face, and was then able to get the 512d vector:

# if PERSON_DETECTED[key][1] is None:
# if compare_coordinates(obj_meta, PERSON_DETECTED[key][0]) == True:
if True:

When computing the euclidean distance on the 512d vectors, the distance is very high. I am using the arcface-r100.engine model:

from scipy.spatial import distance

# where a is the known person's 512d vector
dst = distance.euclidean(a, normal_array)
print(dst)

I tried with the default insightface and the distance is low. Maybe there is some pre-processing involved (like face alignment with the RetinaFace 5 points) that changes the model's accuracy, or something else? To be precise, the DeepStream distance output is 0.8-1.1 while the default output is 0.1-0.6 on the same video. There is a similar accuracy issue with RetinaFace: compared to the default RetinaFace, a few faces are sometimes not detected.

Could you try it on your end in DeepStream? Steps to check (see the sketch after this list):

  1. Take a video or webcam recording of a single person.
  2. Save the 512d vector in a txt or pickle file.
  3. Add the variable a = the extracted list (only the 512 values).
  4. Take a new video of the same person (or reuse the same video) for testing:

    # where a is the known person's 512d vector
    dst = distance.euclidean(a, normal_array)
    print(dst)

    Check the distance.
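Putting those steps together, a minimal sketch (the file name and helper names here are made up for illustration; the embedding itself is assumed to come from the pipeline's probe):

import pickle
import numpy as np
from scipy.spatial import distance

# Step 2: save the known person's 512d vector once (enrollment run).
def save_embedding(vec, path="person_a.pkl"):  # hypothetical file name
    with open(path, "wb") as f:
        pickle.dump(np.asarray(vec, dtype=np.float32), f)

# Steps 3-4: load it back and compare against a freshly extracted vector.
def check_distance(path, normal_array):
    with open(path, "rb") as f:
        a = pickle.load(f)  # a is the known person's 512d vector
    dst = distance.euclidean(a, normal_array)
    print(dst)
    return dst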

zhouyuchong commented 2 years ago

@qustions Yes! You are right, there should be (and has to be) a face alignment step before ArcFace. Unfortunately, there is no existing way to do alignment preprocessing in DeepStream. I guess that's the reason for the large distance between faces of the same person. I'm working on face alignment preprocessing at present; it requires modifying the source code of nvinfer or nvdspreprocess. Once I finish, I'll upload the custom plugin.
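For reference, the alignment that insightface-style pipelines apply is a similarity transform mapping the five RetinaFace landmarks onto a canonical 112x112 template. A minimal sketch with OpenCV and scikit-image (the template coordinates below are the commonly used ArcFace reference points, not something taken from this repo):

import cv2
import numpy as np
from skimage import transform as trans

# Commonly used ArcFace reference landmarks for a 112x112 crop:
# left eye, right eye, nose tip, left mouth corner, right mouth corner.
ARCFACE_DST = np.array(
    [[38.2946, 51.6963], [73.5318, 51.5014], [56.0252, 71.7366],
     [41.5493, 92.3655], [70.7299, 92.2041]], dtype=np.float32)

def align_face(img, landmarks):
    # landmarks: 5x2 array of (x, y) points from RetinaFace.
    tform = trans.SimilarityTransform()
    tform.estimate(np.asarray(landmarks, dtype=np.float32), ARCFACE_DST)
    M = tform.params[0:2, :]  # 2x3 affine matrix
    return cv2.warpAffine(img, M, (112, 112))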

qustions commented 2 years ago

@zhouyuchong let me know if you need any help developing the face alignment to speed up the process. I have added the face comparison in Python.

zhouyuchong commented 2 years ago

@qustions I have almost finished the alignment demo; you can find it here. The plugin currently only works with the tensorrtx TRT version of RetinaFace, and the code still needs to be optimized. You can check it and add your comparison on top to see the performance :-). And if you have time, you could optimize the code and send a pull request.

qustions commented 2 years ago

@zhouyuchong I am getting Segmentation fault (core dumped) after make and make install of the plugin (this). I have changed the RetinaFace config to add output-tensor-meta=1, and in config_arcface.txt added:

alignment=1
user-meta=1
output-tensor-meta=1

This is the error:

gstname= video/x-raw
features= <Gst.CapsFeatures object at 0x7ff604cf99a0 (GstCapsFeatures at 0x7ff380066120)>
face of person-0 detected in frame3
Segmentation fault (core dumped)

When I added alignment=1 and user-meta=1 below network-type=1, I got this error:

face of person-12 detected in frame67
*******Now trans object tracker id:23****
***************************************************************************check finish.
face of person-23 detected in frame238
Segmentation fault (core dumped)

After debugging, I think it's crashing after tensor_meta = pyds.NvDsInferTensorMeta.cast(user_meta.user_meta_data). Is there anything else I am missing?
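One thing I am checking (a guess, not a confirmed fix): whether the probe casts user meta that is not actually tensor-output meta, e.g. the plugin's own alignment meta, which could crash. A small guard before the cast:

user_meta = pyds.NvDsUserMeta.cast(l_user.data)
# Only cast when the meta type really is tensor-output meta.
if user_meta.base_meta.meta_type == pyds.NvDsMetaType.NVDSINFER_TENSOR_OUTPUT_META:
    tensor_meta = pyds.NvDsInferTensorMeta.cast(user_meta.user_meta_data)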

zhouyuchong commented 2 years ago

@qustions I strongly suggest you update DeepStream to 6.1; versions 6.0 and 6.0.1 have lots of bugs. I met a lot of Segmentation fault (core dumped) errors before that were hard to deal with. I tried with a local file and an RTSP source; both work fine. I am still working on the alignment preprocessing; let me know if you meet any problems.

In cb_newpad

gstname= video/x-raw
features= <Gst.CapsFeatures object at 0x7f5b8b4f3220 (GstCapsFeatures at 0x7f59bc0c7aa0)>
*******Now trans object tracker id:17****
check finish.
face of person-17 detected in frame10
get facial features of person 17
*******Now trans object tracker id:53****
check finish.
face of person-53 detected in frame3528
get facial features of person 53
*******Now trans object tracker id:61****
check finish.
face of person-61 detected in frame6216
get facial features of person 61
*******Now trans object tracker id:68****
check finish.
face of person-68 detected in frame6424
get facial features of person 68
*******Now trans object tracker id:93****
check finish.
face of person-93 detected in frame8804
get facial features of person 93
*******Now trans object tracker id:100****
check finish.
face of person-100 detected in frame9026
get facial features of person 100
*******Now trans object tracker id:101****
check finish.
face of person-101 detected in frame9052
get facial features of person 101
*******Now trans object tracker id:111****
check finish.
face of person-111 detected in frame9906
get facial features of person 111
*******Now trans object tracker id:127****
check finish.
face of person-127 detected in frame11158
get facial features of person 127
*******Now trans object tracker id:182****
check finish.
face of person-182 detected in frame19234
get facial features of person 182
*******Now trans object tracker id:205****
check finish.
face of person-205 detected in frame21268
get facial features of person 205
*******Now trans object tracker id:208****
check finish.
face of person-208 detected in frame21522
get facial features of person 208
qustions commented 2 years ago

@zhouyuchong I have moved from DeepStream 6.0 to 6.1 and it's working fine without errors. But this time the distance is very low for all faces, and the OSD is also not showing persons properly.

zhouyuchong commented 2 years ago

@qustions For the OSD, it is expected that a tracked person is shown for only a few frames: I delete people whose faces have been recognized, to avoid the pipeline getting stuck. ArcFace can only handle maybe 30 faces per second, so in a crowded scene without deletion the whole pipeline would freeze. For the distance, I don't think you should use euclidean distance; cosine similarity is probably better. I'll check it later.
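If it helps, a minimal cosine-similarity sketch for the 512d vectors. Note that if both embeddings are L2-normalized first, euclidean distance and cosine similarity carry the same information, since dist^2 = 2 - 2*cos for unit vectors:

import numpy as np

def cosine_similarity(a, b):
    # Cosine of the angle between the two 512d embeddings, in [-1, 1].
    a = np.asarray(a, dtype=np.float32)
    b = np.asarray(b, dtype=np.float32)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))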

qustions commented 2 years ago

@zhouyuchong thanks for confirming. I have even tried cosine similarity and FAISS, but the distance between two embeddings of the same person is still large. What I did: 1. used the default ArcFace model and saved the 512 points, and 2. used the TensorRT engine from DeepStream with the same image, but got different 512 points. Maybe the ArcFace model input is in a different format.
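If the two engines return different 512d vectors for the same crop, the usual suspect is input preprocessing: insightface-style ArcFace models typically expect an RGB 112x112 crop normalized as (pixel - 127.5) / 128 in CHW order, while the DeepStream side is controlled by net-scale-factor and offsets in the nvinfer config. A sketch of the reference preprocessing to compare against (an assumption to verify, not taken from this repo):

import cv2
import numpy as np

def preprocess_arcface(bgr_img):
    # Typical insightface ArcFace preprocessing; verify it matches the
    # nvinfer config's net-scale-factor / offsets and model-color-format.
    img = cv2.cvtColor(bgr_img, cv2.COLOR_BGR2RGB)
    img = cv2.resize(img, (112, 112))
    img = (img.astype(np.float32) - 127.5) / 128.0  # [0,255] -> roughly [-1,1]
    return np.transpose(img, (2, 0, 1))  # HWC -> CHW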

qustions commented 2 years ago

@zhouyuchong Finally solved the issue. The problem was with the r100 model; with the mobilenet model it's working fine. I don't know what's wrong with the r100 model; let me know if you find anything on r100. Closing this issue now.

zhouyuchong commented 2 years ago

@qustions Great! Glad to hear that!