Closed by dexception 3 years ago
+1
@dexception `data` is a big batch with shape (N, C, H, W) and type numpy array (float32). For example, if we plan to use 100 images for calibration, `data` would have shape (100, C, H, W).
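A minimal numpy sketch of the batch described above. The sizes are placeholders (use your model's input resolution), and the random arrays stand in for real, preprocessed calibration images so the sketch is self-contained:

```python
import numpy as np

# Placeholder dimensions: 100 calibration images, 3 channels, 64x64.
# In practice H and W must match the network's input resolution.
N, C, H, W = 100, 3, 64, 64

# Each image would normally be loaded from the calibration set and
# preprocessed exactly as at inference time; random data is used here
# only to keep the sketch runnable.
images = [np.random.rand(C, H, W).astype(np.float32) for _ in range(N)]

# Stack into the single big batch expected by the calibrator.
data = np.stack(images, axis=0)
print(data.shape, data.dtype)  # (100, 3, 64, 64) float32
```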
I will add this example to the repo soon.

@YonghaoHe I will wait for your example.
Would love to see stats on the accuracy drop and the speed improvement of INT8 mode vs FP16.
@dexception @ashuezy I have implemented INT8 inference; you can check timing_inference_latency.py and predict_tensorrt.py. I have also updated the INT8 inference latency in README.md.
@YonghaoHe Very good accuracy with the INT8 implementation. I ran the timing inference on a 2080Ti for XS, with 6 GB memory; the results are:

- FP32: 329 FPS
- FP16: 449 FPS
- INT8: 480 FPS
@dexception 😄
I have made the following change in lfd/deployment/tensorrt/build_engine.py:

```python
assert int8_calibrator is not None, 'calibrator is not provided!'
```
I don't understand the `data` parameter. Can you please help with passing images via this `data` attribute?