NVIDIA-AI-IOT / deepstream_tao_apps

Sample apps to demonstrate how to deploy models trained with TAO on DeepStream
MIT License

Error: Failure in generating int8 calibration YOLOv3 with TLT #4

Open Dewei36 opened 4 years ago

Dewei36 commented 4 years ago

I am running the YOLO example notebook from the TLT version 2 release inside the TLT docker container. I have successfully trained, pruned, and retrained the YOLO-resnet18 model. The FP32 results are fine, but the problem appears when I execute the tlt-export (int8) command as shown in the notebook:

!tlt-export yolo -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/yolo_resnet18_epoch_$EPOCH.tlt \
                 -o $USER_EXPERIMENT_DIR/FFxport/yolo_resnet18_8epoch_$EPOCH.etlt \
                 -e $SPECS_DIR/yolo_retrain_resnet18_kitti.txt \
                 -k $KEY \
                 --cal_image_dir $USER_EXPERIMENT_DIR/data/training/image_2 \
                 --data_type int8 \
                 --batch_size 1 \
                 --batches 10 \
                 --cal_cache_file $USER_EXPERIMENT_DIR/FFxport/cal.bin \
                 --cal_data_file $USER_EXPERIMENT_DIR/FFxport/cal.tensorfile

The following warnings appeared in the output:

Using TensorFlow backend.
2020-05-12 02:09:02,221 [INFO] /usr/local/lib/python2.7/dist-packages/iva/yolo/utils/spec_loader.pyc: Merging specification from /workspace/examples/yolo/specs/yolo_retrain_resnet18_kitti.txt
2020-05-12 02:09:05,844 [INFO] /usr/local/lib/python2.7/dist-packages/iva/yolo/utils/spec_loader.pyc: Merging specification from /workspace/examples/yolo/specs/yolo_retrain_resnet18_kitti.txt
NOTE: UFF has been tested with TensorFlow 1.14.0.
WARNING: The version of TensorFlow installed on this system is not guaranteed to work with UFF.
Warning: No conversion function registered for layer: BatchedNMS_TRT yet.
Converting BatchedNMS as custom op: BatchedNMS_TRT
Warning: No conversion function registered for layer: ResizeNearest_TRT yet.
Converting upsample1/ResizeNearestNeighbor as custom op: ResizeNearest_TRT
Warning: No conversion function registered for layer: ResizeNearest_TRT yet.
Converting upsample0/ResizeNearestNeighbor as custom op: ResizeNearest_TRT
Warning: No conversion function registered for layer: BatchTilePlugin_TRT yet.
Converting FirstDimTile_2 as custom op: BatchTilePlugin_TRT
Warning: No conversion function registered for layer: BatchTilePlugin_TRT yet.
Converting FirstDimTile_1 as custom op: BatchTilePlugin_TRT
Warning: No conversion function registered for layer: BatchTilePlugin_TRT yet.
Converting FirstDimTile_0 as custom op: BatchTilePlugin_TRT
DEBUG [/usr/lib/python2.7/dist-packages/uff/converters/tensorflow/converter.py:96] Marking ['BatchedNMS'] as outputs
2020-05-12 02:09:22,062 [WARNING] modulus.export._tensorrt: Calibration file /workspace/tlt-experiments/yolo/export/cal.bin exists but is being ignored.
[TensorRT] INFO: Detected 1 inputs and 4 output network tensors.
[TensorRT] WARNING: Current optimization profile is: 0. Please ensure there are no enqueued operations pending in this context prior to switching profiles
[TensorRT] INFO: Starting Calibration with batch size 1.
DEPRECATED: This variant of get_batch is deprecated. Please use the single argument variant described in the documentation instead.
[TensorRT] INFO: Calibrated batch 0 in 0.101843 seconds.
[TensorRT] INFO: Calibrated batch 1 in 0.0926203 seconds.
[TensorRT] INFO: Calibrated batch 2 in 0.0919941 seconds.
[TensorRT] INFO: Calibrated batch 3 in 0.0910869 seconds.
[TensorRT] INFO: Calibrated batch 4 in 0.0929871 seconds.
[TensorRT] INFO: Calibrated batch 5 in 0.0934323 seconds.
[TensorRT] INFO: Calibrated batch 6 in 0.099967 seconds.
[TensorRT] INFO: Calibrated batch 7 in 0.104515 seconds.
[TensorRT] INFO: Calibrated batch 8 in 0.0996476 seconds.
[TensorRT] INFO: Calibrated batch 9 in 0.0921785 seconds.
[TensorRT] WARNING: Tensor BatchedNMS is uniformly zero; network calibration failed.
[TensorRT] WARNING: Tensor BatchedNMS_1 is uniformly zero; network calibration failed.
[TensorRT] WARNING: Tensor BatchedNMS_2 is uniformly zero; network calibration failed.
[TensorRT] INFO: Post Processing Calibration data in 3.91784 seconds.
[TensorRT] INFO: Calibration completed in 37.7898 seconds.
2020-05-12 02:09:59,897 [WARNING] modulus.export._tensorrt: Calibration file /workspace/tlt-experiments/yolo/export/cal.bin exists but is being ignored.
[TensorRT] INFO: Writing Calibration Cache for calibrator: TRT-7000-EntropyCalibration2
2020-05-12 02:09:59,897 [INFO] modulus.export._tensorrt: Saving calibration cache (size 9237) to /workspace/tlt-experiments/yolo/export/cal.bin
[TensorRT] WARNING: Rejecting int8 implementation of layer BatchedNMS due to missing int8 scales, will choose a non-int8 implementation.
[TensorRT] INFO: Detected 1 inputs and 4 output network tensors

It looks like the int8 calibration was not generated successfully, but I am not sure what the warnings mean or how to resolve them. I searched several forums but found nothing related to TLT version 2.

I appreciate your help! Thanks!
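
To separate the routine messages from the ones that concern calibration, the log quoted above can be scanned for the three message types it contains: tensors reported as uniformly zero, layers whose int8 implementation was rejected, and the line that reports the cache being saved. The following is only a minimal sketch and not part of TLT or TensorRT. The helper name `summarize_calibration_log` is hypothetical, and the regular expressions are assumptions based solely on the message text in this post, so they may not match other TLT or TensorRT versions.

```python
import re

# Patterns taken from the message text in the log above.
# Assumption: the wording is only known to match this particular log.
ZERO_TENSOR = re.compile(r"Tensor (\S+) is uniformly zero; network calibration failed")
REJECTED_LAYER = re.compile(r"Rejecting int8 implementation of layer (\S+)")
CACHE_SAVED = re.compile(r"Saving calibration cache \(size (\d+)\) to (\S+)")


def summarize_calibration_log(log_text):
    """Hypothetical helper: collect calibration-related facts from tlt-export output."""
    saved = CACHE_SAVED.search(log_text)
    return {
        "zero_tensors": ZERO_TENSOR.findall(log_text),
        "rejected_layers": REJECTED_LAYER.findall(log_text),
        "cache_path": saved.group(2) if saved else None,
        "cache_size": int(saved.group(1)) if saved else None,
    }


# A few lines copied from the log in this post.
sample = """\
[TensorRT] WARNING: Tensor BatchedNMS is uniformly zero; network calibration failed.
[TensorRT] WARNING: Tensor BatchedNMS_1 is uniformly zero; network calibration failed.
[TensorRT] WARNING: Tensor BatchedNMS_2 is uniformly zero; network calibration failed.
2020-05-12 02:09:59,897 [INFO] modulus.export._tensorrt: Saving calibration cache (size 9237) to /workspace/tlt-experiments/yolo/export/cal.bin
[TensorRT] WARNING: Rejecting int8 implementation of layer BatchedNMS due to missing int8 scales, will choose a non-int8 implementation.
"""

summary = summarize_calibration_log(sample)
print(summary)
```

For this log, every tensor and layer named in the warnings is a BatchedNMS output, and the log also records the calibration cache being saved.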

mchi-zg commented 4 years ago

[INFO] modulus.export._tensorrt: Saving calibration cache (size 9237) to /workspace/tlt-experiments/yolo/export/cal.bin

According to the log line above, I think the INT8 calibration cache was generated.
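
One way to double-check this from the notebook is to confirm that the cache file named in the log exists on disk and has the reported size. This is a rough sketch using only the Python standard library. The function name `cache_looks_written` is hypothetical, and a matching size only shows that a file was written, not that its contents are usable for inference.

```python
import os


def cache_looks_written(path, expected_size=None):
    """Hypothetical check: True if `path` is a non-empty file.

    If `expected_size` is given (for example the size reported in the
    export log), the file size must also match it exactly.
    """
    if not os.path.isfile(path):
        return False
    size = os.path.getsize(path)
    if size == 0:
        return False
    return expected_size is None or size == expected_size


# Path and size as reported in the log line quoted above.
print(cache_looks_written("/workspace/tlt-experiments/yolo/export/cal.bin", expected_size=9237))
```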

morganh-nv commented 4 years ago

Please refer to https://forums.developer.nvidia.com/t/errors-tlt-export-tlt-yolo-model-to-int8-calibration/122787