NVIDIA-AI-IOT / face-mask-detection

Face Mask Detection using NVIDIA Transfer Learning Toolkit (TLT) and DeepStream for COVID-19
MIT License

TLT Inputs and outputs #6

Closed hectormdom closed 3 years ago

hectormdom commented 3 years ago

Between what I've read, what I've watched, and what's in the Jupyter notebook, I'm a bit confused about the inputs and outputs, so I'd like to run what I think is correct past you. I modified this relative to the original Jupyter notebook because the default notebook seems to work partially with both the unpruned and pruned models, but it doesn't include all the steps needed to run both. My goal is to run the pruned model on the Jetson Nano.

[On the x86 system]

  1. Training
    --input pretrained_resnet18/tlt_pretrained_detectnet_v2_vresnet18/resnet18.hdf5
    --output experiment_dir_unpruned/weights/resnet18_detector.tlt
  2. Pruning
    --input experiment_dir_unpruned/weights/resnet18_detector.tlt
    --output experiment_dir_pruned/resnet18_nopool_bn_detectnet_v2_pruned.tlt
  3. Retrain
    --input experiment_dir_pruned/resnet18_nopool_bn_detectnet_v2_pruned.tlt
    --output experiment_dir_retrain/weights/resnet18_detector_pruned.tlt
  4. Deploy
    --input experiment_dir_retrain/weights/resnet18_detector_pruned.tlt
    --output experiment_dir_final/resnet18_detector_thermal.etlt

    (There's a second part of this step in the notebook that I think was for benchmarking the pruned model against the unpruned one; I believe it's safe to ignore if I'm just trying to get the pruned model running on the Jetson Nano.)

9A. Int8 Optimisation

--input experiment_dir_pruned/resnet18_nopool_bn_detectnet_v2_pruned.tlt
--output experiment_dir_final/calibration.tensor

Second step:

--input experiment_dir_retrain/weights/resnet18_detector_pruned.tlt
--input experiment_dir_final/calibration.tensor
--output experiment_dir_final/resnet18_detector.etlt
--output experiment_dir_final/resnet18_detector.trt.int8
--output experiment_dir_final/calibration.bin

[On the Jetson Nano]

9B. Generate the TensorRT engine, using the downloaded tlt-converter

--input experiment_dir_final/resnet18_detector.etlt
--input experiment_dir_final/calibration.bin
--output experiment_dir_final/resnet18_detector.trt
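To sanity-check the wiring, the input/output chaining above can be modeled in a few lines of Python. The paths are condensed from the steps above (using the plain `resnet18_detector.etlt` name rather than the `_thermal` variant, and assuming the retrained weights land in `experiment_dir_retrain/weights/`); the `check_chaining` helper is just an illustration, not part of TLT:

```python
# Sketch: verify that every step's input file is either a starting
# artifact or the output of an earlier step. Nothing here invokes TLT;
# it only models the pipeline's file chaining.
PIPELINE = [
    ("train",   ["pretrained_resnet18/tlt_pretrained_detectnet_v2_vresnet18/resnet18.hdf5"],
                ["experiment_dir_unpruned/weights/resnet18_detector.tlt"]),
    ("prune",   ["experiment_dir_unpruned/weights/resnet18_detector.tlt"],
                ["experiment_dir_pruned/resnet18_nopool_bn_detectnet_v2_pruned.tlt"]),
    ("retrain", ["experiment_dir_pruned/resnet18_nopool_bn_detectnet_v2_pruned.tlt"],
                ["experiment_dir_retrain/weights/resnet18_detector_pruned.tlt"]),
    ("export",  ["experiment_dir_retrain/weights/resnet18_detector_pruned.tlt"],
                ["experiment_dir_final/resnet18_detector.etlt"]),
    ("convert", ["experiment_dir_final/resnet18_detector.etlt"],
                ["experiment_dir_final/resnet18_detector.trt"]),
]

def check_chaining(pipeline, starting_artifacts):
    """Return the names of steps whose inputs are not produced upstream."""
    available = set(starting_artifacts)
    broken = []
    for name, inputs, outputs in pipeline:
        if not all(path in available for path in inputs):
            broken.append(name)
        available.update(outputs)
    return broken

print(check_chaining(
    PIPELINE,
    {"pretrained_resnet18/tlt_pretrained_detectnet_v2_vresnet18/resnet18.hdf5"},
))
# → [] (an empty list means every step's input is produced upstream)
```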

Please let me know if I'm doing anything wrong, cheers!

ak-nv commented 3 years ago

All steps up to 9 look right. INT8 mode is not currently supported on the Jetson Nano, so skip the INT8 optimization step. For 10B, you need to change the tlt-converter option to -t fp16.

hectormdom commented 3 years ago

Hi, thanks for your quick reply. So I should skip 9A entirely (sorry, I had a typo when I first typed it; it wasn't 10A/10B), and thus not generate any calibration files? If I go this way, the only file I would have to transfer to the Nano would be the first output of step 9 (--output experiment_dir_final/resnet18_detector_thermal.etlt). To proceed with 9B, should I just remove the -c flag, or would that argument be substituted with something else?

!tlt-converter $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.etlt \
               -k $KEY \
               -c $USER_EXPERIMENT_DIR/experiment_dir_final/calibration.bin \
               -o output_cov/Sigmoid,output_bbox/BiasAdd \
               -d 3,544,960 \
               -i nchw \
               -m 64 \
               -t fp16 \
               -e $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.trt \
               -b 4
ak-nv commented 3 years ago

the only file I would have to transfer to the nano would be first output of step 9

Sounds right.

In order to proceed with 9B should I just remove the -c flag? or would such argument be substituted with something else?

Yes. Calibration is performed in TensorRT to reduce the loss of accuracy in INT8 mode; you can remove -c.
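A minimal way to see the difference between the two modes: assemble the tlt-converter argument list conditionally, omitting the -c (calibration cache) flag outside INT8 mode. The flag names mirror the command shown earlier in the thread; the `converter_args` helper itself is hypothetical, just a sketch of the idea:

```python
# Sketch: build tlt-converter arguments for fp16 vs. int8 mode.
# In fp16 mode (e.g. on the Jetson Nano) the -c calibration-cache
# argument is simply left out; in int8 mode it is required.
def converter_args(etlt_path, engine_path, key, precision="fp16", cal_cache=None):
    args = [
        etlt_path,
        "-k", key,
        "-o", "output_cov/Sigmoid,output_bbox/BiasAdd",
        "-d", "3,544,960",
        "-i", "nchw",
        "-m", "64",
        "-t", precision,
        "-e", engine_path,
        "-b", "4",
    ]
    if precision == "int8":
        if cal_cache is None:
            raise ValueError("int8 mode needs a calibration cache (-c)")
        args += ["-c", cal_cache]
    return args

fp16_args = converter_args(
    "experiment_dir_final/resnet18_detector.etlt",
    "experiment_dir_final/resnet18_detector.trt",
    key="$KEY",
)
assert "-c" not in fp16_args  # no calibration file needed for fp16
```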

ak-nv commented 3 years ago

Closing since there has been no update in 10 days; please re-open if you have any questions along similar lines.