NVIDIA-AI-IOT / face-mask-detection

Face Mask Detection using NVIDIA Transfer Learning Toolkit (TLT) and DeepStream for COVID-19
MIT License

deepstream detect nothing #11

Open XiaoPengZong opened 3 years ago

XiaoPengZong commented 3 years ago

Hi, I want to deploy the model to DeepStream. I have evaluated the trained model and got the results below:

class name      average precision (in %)
------------  --------------------------
mask                             87.3164
no-mask                          79.17

I also tested images with the tlt-infer tool, and the results are acceptable, although some masks are not detected. However, when I deploy the model in DeepStream, it cannot detect anything. First, I used the tlt-export tool to get the etlt model named "model-48500.etlt" and copied it to the DeepStream config path.

config_infer_primary_masknet_gpu.txt

[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
tlt-model-key=YjlxOTRkaHRjYWI2Z2NxN2cwOXBlZjh1OTQ6ZTE2YjdkNzctMmQ0OS00MDZhLTgzMGMtNjc5ZTIyZGNkNzA1
tlt-encoded-model=model-48500.etlt
labelfile-path=labels_masknet.txt
# GPU Engine File
model-engine-file=model-48500.etlt_b1_gpu0_fp16.engine
# DLA Engine File
# model-engine-file=/home/nvidia/detectnet_v2_models/detectnet_4K-fddb-12/resnet18_RGB960_detector_fddb_12_int8.etlt_b1_dla0_int8.engine
input-dims=3;960;544;0
uff-input-blob-name=input_1
batch-size=1
model-color-format=0
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=2
#int8-calib-file=/mnt/8c3f68c9-a08a-400b-8c80-99c5fee26a06/detectnet_v2_models/detectnet_4K-fddb-12/calibration.bin
num-detected-classes=2
cluster-mode=1
interval=0
gie-unique-id=1
is-classifier=0
classifier-threshold=0.9
output-blob-names=output_bbox/BiasAdd;output_cov/Sigmoid

deepstream_app_source1_camera_masker_gpu.txt

[primary-gie]
enable=1
gpu-id=0
# Modify as necessary
# GPU engine file
model-engine-file=model-48500.etlt_b1_gpu0_fp16.engine
batch-size=1
#Required by the app for OSD, not a plugin property
bbox-border-color0=0;1;0;1
bbox-border-color1=1;0;0;1
#bbox-border-color2=0;0;1;1 # Blue
#bbox-border-color3=0;1;0;1
gie-unique-id=1
config-file=config_infer_primary_masknet_gpu.txt

Can you help me check it? Thanks!

ak-nv commented 3 years ago

Good to hear your accuracy is close to what we have got.

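In config_infer_primary_masknet_gpu.txt:

  • reduce the classifier-threshold parameter; in some cases I went down to 0.6
  • add [class-attrs-0] and [class-attrs-1] groups below it and experiment with the following three parameters: pre-cluster-threshold, group-threshold and eps. You can find more about these parameters in the DeepStream dev guide, in the table titled: Gst-nvinfer plugin, [class-attrs-...] groups, supported keys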

sudhirm4 commented 3 years ago

Hello,

I have a similar issue: I am unable to detect faces and masks. Here are the model evaluation results:

Validation cost: 0.001500
Mean average_precision (in %): 85.0774

class name      average precision (in %)
------------  --------------------------
mask                             84.7703
no-mask                          85.3846

Config file:

[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
tlt-model-key=tlt_encode
tlt-encoded-model=/opt/nvidia/deepstream/deepstream-5.0/samples/mask-detection/resnet18_detector_unpruned.etlt
labelfile-path=labels_masknet.txt
# GPU Engine File
model-engine-file=/opt/nvidia/deepstream/deepstream-5.0/samples/mask-detection/resnet18_detector_unpruned.engine
input-dims=3;960;544;0
uff-input-blob-name=input_1
batch-size=1
model-color-format=0
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=0
#int8-calib-file=/mnt/8c3f68c9-a08a-400b-8c80-99c5fee26a06/detectnet_v2_models/detectnet_4K-fddb-12/calibration.bin
num-detected-classes=2
cluster-mode=1
interval=0
gie-unique-id=1
is-classifier=0
classifier-threshold=0.5
output-blob-names=output_bbox/BiasAdd;output_cov/Sigmoid

[class-attrs-0]
pre-cluster-threshold=0.3
group-threshold=1
eps=0.5
#minBoxes=1
detected-min-w=0
detected-min-h=0
detected-max-w=0
detected-max-h=0

[class-attrs-1]
pre-cluster-threshold=0.3
group-threshold=1
eps=0.3
#minBoxes=1
detected-min-w=0
detected-min-h=0
detected-max-w=0
detected-max-h=0

Can you please share model file and sample video for testing?

XiaoPengZong commented 3 years ago

@ak-nv Hi, I changed classifier-threshold from 0.1 to 0.9 and there is no difference. I already have the [class-attrs-0] and [class-attrs-1] groups with default values in config_infer_primary_masknet_gpu.txt. Any other advice about this problem? Thanks!

ak-nv commented 3 years ago

I usually try these lines and it works. Have you tried camera mode? Unfortunately, I cannot share the pre-trained model for face-mask-detection.

Good to hear your accuracy is close to what we have got.

In config_infer_primary_masknet_gpu.txt

  • reduce the classifier-threshold parameter; in some cases I went down to 0.6
  • add [class-attrs-0] and [class-attrs-1] groups below it and experiment with the following three parameters: pre-cluster-threshold, group-threshold and eps. You can find more about these parameters in the DeepStream dev guide, in the table titled: Gst-nvinfer plugin, [class-attrs-...] groups, supported keys

sudhirm4 commented 3 years ago

Hi,

I got results when I tested with the unpruned model:

!tlt-infer detectnet_v2 -e $SPECS_DIR/detectnet_v2_inference_kitti_tlt.txt \
                        -o $USER_EXPERIMENT_DIR/test/output \
                        -i $USER_EXPERIMENT_DIR/test/images \
                        -k $KEY

However, when it is exported to an etlt model and tested with DeepStream, it does not work.

Any suggestions?

Thanks

ak-nv commented 3 years ago

Do you get any errors? Did you try the solution suggested above? Are you using the camera or the video DeepStream config file?

XiaoPengZong commented 3 years ago

@ak-nv I have tested both the camera and a video generated from the dataset pictures; neither of them detects any objects.

sudhirm4 commented 3 years ago

I made the following changes and am now able to detect masks:

[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
tlt-model-key=tlt_encode
tlt-encoded-model=/opt/nvidia/deepstream/deepstream/samples/mask-detection/resnet18_detector_unpruned.etlt
labelfile-path=/opt/nvidia/deepstream/deepstream/samples/mask-detection/labels_masknet.txt
input-dims=3;544;960;0
uff-input-blob-name=input_1
batch-size=1
process-mode=1
model-color-format=0
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=0
num-detected-classes=3
cluster-mode=1
interval=0
gie-unique-id=1
output-blob-names=output_bbox/BiasAdd;output_cov/Sigmoid

[class-attrs-all]
pre-cluster-threshold=0.4
## Set eps=0.7 and minBoxes for cluster-mode=1(DBSCAN)
eps=0.7
minBoxes=1

ak-nv commented 3 years ago

@sudhirm4 Good to hear. Can we conclude that experimenting with [class-attrs-all], eps=0.7 and minBoxes=1 worked?
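
For readers following along, the [class-attrs-all] group from the config above can be generated and sanity-checked with a short Python sketch. The helper names render_class_attrs and parse_section are hypothetical, and the values are simply the ones sudhirm4 reported (pre-cluster-threshold=0.4, eps=0.7, minBoxes=1); treat them as a starting point for experimentation, not as recommended settings.

```python
import configparser


def render_class_attrs(section, pre_cluster_threshold, eps, min_boxes):
    """Render one [class-attrs-...] group as nvinfer-style key=value text."""
    lines = [
        f"[{section}]",
        f"pre-cluster-threshold={pre_cluster_threshold}",
        "## eps and minBoxes are used with cluster-mode=1 (DBSCAN)",
        f"eps={eps}",
        f"minBoxes={min_boxes}",
    ]
    return "\n".join(lines) + "\n"


def parse_section(config_text, section):
    """Read a group back so typos in key names are easy to spot."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case, e.g. minBoxes
    parser.read_string(config_text)
    return dict(parser[section])


# Values reported in this thread; adjust them while experimenting.
block = render_class_attrs("class-attrs-all", 0.4, 0.7, 1)
print(block)
print(parse_section(block, "class-attrs-all"))
```

The printed group can be pasted at the end of config_infer_primary_masknet_gpu.txt; the round trip through configparser is only a quick check that the text is well-formed.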

XiaoPengZong commented 3 years ago

@sudhirm4 @ak-nv Hi, I changed my config file to match yours. It can now detect some masks, but most of them are still not detected.

XiaoPengZong commented 3 years ago

Hello, any update?

ak-nv commented 3 years ago

@XiaoPengZong You might want to experiment with the parameters above. Also, I plan to add more detailed steps and examples with a sample video in the upcoming weeks, if that helps.

hosseinzadeh88 commented 3 years ago

I am experiencing the same detection issue when using the trained model (unpruned and pruned) with DeepStream. I trained the model and got about 83% accuracy for both classes. So far I have tried the unpruned and pruned models in FP16 and FP32 mode. I also increased the epochs from 120 to 200 and retrained the network, but nothing changed.

Unless I reduce the detection confidence (aka classifier-threshold) and pre-cluster-threshold to a ridiculously low value (0.01, or 1%), it does not detect anything. Unfortunately, at 0.01 it detects almost everything and still manages to miss more than half of the faces! The performance is terrible compared to the purpose-built PeopleNet model (which is also a ResNet18).

@ak-nv Could I possibly use the PeopleNet (trained) https://ngc.nvidia.com/catalog/models/nvidia:tlt_peoplenet as the pre-trained model to get better detections, or is that one probably trained on too many unmasked faces, so this won't work? Or is it possible at all to use an etlt model as my pre-trained model and train it further for mask detection?

One thing that I kind of noticed is that it is better at detecting large faces that are closer to the camera. That's just my initial observation and I'm not too sure about it, as I didn't try it with a lot of different videos.

I'm still running further experiments and trying to figure out why the trained model has such poor detection. I'll report back my findings; if you have any suggestions, please let me know.

ak-nv commented 3 years ago

One thing that I kind of noticed is that it is better at detecting large faces that are closer to the camera. That's just my initial observation and I'm not too sure about it, as I didn't try it with a lot of different videos.

Most of the images in the dataset show faces close to the camera, so your observation is right. You can add augmentations to improve this: TLT_augmentation

Could I possibly use the PeopleNet (trained) https://ngc.nvidia.com/catalog/models/nvidia:tlt_peoplenet as the pre-trained model to get better detections

I have not tried this, but this might be a good way.

ak-nv commented 3 years ago

@XiaoPengZong Can you add maintain-aspect-ratio=1 to config_infer_primary_masknet_gpu.txt and try it? Please let me know.
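
To see why this setting can matter, here is a minimal Python sketch of the geometry only. It assumes a hypothetical 640x480 camera frame and the 960x544 network input used in this thread; the function names are made up for illustration, and the sketch does not model where nvinfer actually places any padding.

```python
def stretch_scales(src_w, src_h, net_w, net_h):
    """Per-axis scale factors when the frame is stretched to the network size."""
    return net_w / src_w, net_h / src_h


def aspect_preserving_fit(src_w, src_h, net_w, net_h):
    """Uniform scale and leftover padding when the aspect ratio is kept."""
    scale = min(net_w / src_w, net_h / src_h)
    scaled_w = round(src_w * scale)
    scaled_h = round(src_h * scale)
    return scale, net_w - scaled_w, net_h - scaled_h


# Hypothetical 640x480 source, 960x544 network input (width x height).
sx, sy = stretch_scales(640, 480, 960, 544)
scale, pad_w, pad_h = aspect_preserving_fit(640, 480, 960, 544)
print(f"stretch: x{sx:.3f} horizontally, x{sy:.3f} vertically")
print(f"keep ratio: x{scale:.3f} on both axes, padding {pad_w}x{pad_h} px")
```

When the two stretch factors differ noticeably, faces reach the network with different proportions than the ones seen during training, which is one plausible reason detections drop.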

XiaoPengZong commented 3 years ago

@ak-nv Sorry, I was on holiday for the last few days. I tried it, but it brought no improvement.

lakshaychhabra commented 3 years ago

@XiaoPengZong any success?

lakshaychhabra commented 3 years ago

So, after spending a day on this, my conclusion is to change the input dimensions. Setting input-dims=3;480;640;0 or even input-dims=3;544;960;0 does the job. I was getting 12 FPS at 544x960, but then I switched to 3;300;300 and got 25 FPS.

Things to change:

  • input-dims=3;300;300
  • pre-cluster-threshold=0.2
  • eps=0.3 or eps=0.4
  • minBoxes=1
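
A quick way to catch the swapped dimensions discussed in this thread is to parse input-dims and look at the order. The sketch below assumes, as I understand the Gst-nvinfer documentation, that the value is channel;height;width;input-order, so 3;960;544;0 describes a frame that is taller than it is wide. The function names are hypothetical.

```python
import configparser


def read_input_dims(config_text):
    """Return channels, height and width from the [property] input-dims key."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    raw = parser["property"]["input-dims"]
    values = [int(v) for v in raw.split(";") if v.strip()]
    channels, height, width = values[:3]  # a 4th value, if present, is the order
    return {"channels": channels, "height": height, "width": width}


def is_portrait(dims):
    """True when height exceeds width, a hint that H and W may be swapped."""
    return dims["height"] > dims["width"]


original = "[property]\ninput-dims=3;960;544;0\n"
fixed = "[property]\ninput-dims=3;544;960;0\n"

for name, text in (("original", original), ("fixed", fixed)):
    dims = read_input_dims(text)
    print(name, dims, "portrait" if is_portrait(dims) else "landscape")
```

A portrait result is only a hint: it is suspicious for a model trained on 960x544 (width x height) images, but it is not an error in general.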

marekjg commented 3 years ago

@XiaoPengZong Can you add maintain-aspect-ratio=1 to config_infer_primary_masknet_gpu.txt and try it? Please let me know.

This hint finally made it work in my case, thanks!