Thilanka97 opened 5 years ago
@Thilanka97 Hi, this is just a coincidence :)
The [region] and [yolo] layers are very sensitive to precision, so we use FP32 for them.
@AlexeyAB Do you think the accuracy can be improved if we use more images during the calibration process?
Also, why did you decide to convert the outputs of each layer to FP32 before feeding them to the next layer? Is it because converting an INT8 output directly to another INT8 (with another multiplier) is not easy to achieve? Also, is there any difference between your implementation and the NVIDIA implementation?
Thanks in advance!
@Thilanka97
Do you think the accuracy can be improved if we use more images during the calibration process?
In most cases, no.
Also, why did you decide to convert the outputs of each layer to FP32 before feeding them to the next layer?
So that the other FP32 layers can be used out of the box.
Is it because converting an INT8 output directly to another INT8 (with another multiplier) is not easy to achieve?
No.
Also, is there any difference between your implementation and the NVIDIA implementation?
I haven't seen the source code of the NVIDIA implementation. Is there open source code for TensorRT?
@AlexeyAB Thank you so much for the reply.
So that the other FP32 layers can be used out of the box.
What do you mean? Do you mean the yolov3 shortcut layers? If I want to convert INT8 directly to INT8 without the FP32 conversion between layers, would that be possible? (I am working with tiny yolov2.)
I haven't seen the source code of the NVIDIA implementation. Is there open source code for TensorRT?
https://devblogs.nvidia.com/int8-inference-autonomous-vehicles-tensorrt/
Only this.
What do you mean? Do you mean the yolov3 shortcut layers?
Any layer that isn't yet implemented for INT8: shortcut, yolo, upsample, ... and any new layers.
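For context, a minimal sketch of what that FP32 boundary might look like: the INT8 convolution accumulates into int32, and a single calibrated multiplier converts the accumulator back to FP32 so that any FP32-only layer can consume it unmodified. The name `dequant_mult` is illustrative, not from the darknet source.

```c
#include <stddef.h>

// Hypothetical sketch of the INT8 -> FP32 boundary: the INT8 convolution
// accumulates into int32, and one calibrated multiplier converts the
// accumulator back to FP32 so FP32-only layers work out of the box.
void dequantize_output(const int *acc, float *out, size_t n, float dequant_mult)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = acc[i] * dequant_mult;  // int32 accumulator -> FP32 activation
}
```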
If I want to convert INT8 directly to INT8 without the FP32 conversion between layers, would that be possible? (I am working with tiny yolov2.)
Yes, but you would have to implement it yourself in the source code.
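For reference, the usual way to skip the FP32 round-trip is to fold the producer's dequantize multiplier and the consumer's quantize multiplier into a single requantize step, saturating to the int8 range. A hedged sketch under those assumptions (all names are hypothetical; this is not darknet code):

```c
#include <math.h>
#include <stddef.h>
#include <stdint.h>

// Clamp a float to the int8 range and round to the nearest integer.
static inline int8_t saturate_int8(float x)
{
    if (x > 127.f)  return 127;
    if (x < -128.f) return -128;
    return (int8_t)roundf(x);
}

// Hypothetical direct INT8 -> INT8 requantization:
// combined_mult = dequant_mult_this_layer * quant_mult_next_layer,
// i.e. the FP32 round-trip folded into one multiplier.
void requantize_int8(const int *acc, int8_t *out, size_t n, float combined_mult)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = saturate_int8(acc[i] * combined_mult);
}
```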
@AlexeyAB What is the R_MULT value, and why 32?
#define R_MULT (32) // 4 - 32
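I can't answer for the author, but a power-of-two constant like this is typically a fixed-point headroom factor: values are scaled up by R_MULT before being rounded to integers so that fewer low-order bits are lost, and the same factor is divided back out when converting to FP32 (the "4 - 32" comment presumably indicates the range of values tried). A hedged illustration of that pattern, not the actual darknet code; `quant_mult` and `dequant_mult` are hypothetical:

```c
#include <math.h>
#include <stdint.h>

#define R_MULT (32)  // extra fixed-point scale, a power of two

// Hypothetical illustration: scaling by R_MULT before rounding keeps
// ~5 more bits of precision, and the factor is divided out again on
// the FP32 side.
int16_t quantize_with_headroom(float x, float quant_mult)
{
    return (int16_t)roundf(x * quant_mult * R_MULT);
}

float dequantize_with_headroom(int32_t acc, float dequant_mult)
{
    return acc * dequant_mult / R_MULT;  // remove the headroom factor
}
```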
Also, you are using FP32 for maxpooling as well, right? Is there a specific reason, or do you use FP32 there just because you already use FP32 in between conv layers?
Thanks in advance!
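On the maxpool question: since quantization with a positive multiplier preserves ordering, max-pooling would give the same result on INT8 values as on FP32 ones, so keeping it in FP32 is presumably just a matter of convenience once the conv outputs are converted back anyway. A minimal INT8 maxpool sketch (hypothetical, not the darknet implementation):

```c
#include <stdint.h>

// Hypothetical INT8 2x2 maxpool (stride 2) over one channel plane.
// Quantization with a positive scale preserves ordering, so taking the
// max of quantized values matches pooling in FP32.
void maxpool2x2_int8(const int8_t *in, int8_t *out, int w, int h)
{
    for (int y = 0; y + 1 < h; y += 2) {
        for (int x = 0; x + 1 < w; x += 2) {
            int8_t m = in[y * w + x];
            if (in[y * w + x + 1] > m)       m = in[y * w + x + 1];
            if (in[(y + 1) * w + x] > m)     m = in[(y + 1) * w + x];
            if (in[(y + 1) * w + x + 1] > m) m = in[(y + 1) * w + x + 1];
            out[(y / 2) * (w / 2) + x / 2] = m;
        }
    }
}
```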
@AlexeyAB I tried to make direct INT8-to-INT8 conversion work, without converting to FP32 in the middle. The code does not give any errors, but it does not show any predictions (detection boxes) on the output image, and it does not print any class probabilities in the terminal. Do you have any idea why this could happen? Please help me.
Thanks in advance!
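One common failure mode in this situation (a guess, not a diagnosis): if the combined multiplier is wrong, the requantized values either collapse to zero or saturate at ±127, and the final layer never produces probabilities above the detection threshold. A hedged sketch of a per-layer sanity check, comparing a dequantized INT8 output against the same layer run in FP32 (`dequant_mult` is hypothetical):

```c
#include <math.h>
#include <stdint.h>
#include <stdio.h>

// Hypothetical sanity check: dequantize an INT8 layer output and compare
// it against the same layer run in FP32. Large relative error or heavy
// saturation at +/-127 usually means the combined multiplier is wrong.
void check_layer(const int8_t *q, const float *ref, int n, float dequant_mult)
{
    int saturated = 0;
    double err = 0.0, norm = 0.0;
    for (int i = 0; i < n; ++i) {
        if (q[i] == 127 || q[i] == -128) ++saturated;
        double d = q[i] * dequant_mult - ref[i];
        err  += d * d;
        norm += (double)ref[i] * ref[i];
    }
    printf("saturated: %d / %d, relative L2 error: %f\n",
           saturated, n, norm > 0 ? sqrt(err / norm) : 0.0);
}
```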
@AlexeyAB Hey, I have a small question: why do you use INT8 only for conv layers with leaky activation, and not for the conv layers with linear activation? Is there a specific reason, or is this just what gives the best accuracy? Also, you do not use INT8 for the [region] layer, right?
Thanks in advance!
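One observation that may be relevant here: leaky ReLU is positively homogeneous (leaky(s·x) = s·leaky(x) for s > 0), so it can be applied directly to quantized values without touching the scale, and in tiny yolov2 the only linear-activation conv is typically the last one, feeding the [region] layer that stays in FP32 anyway. A hedged sketch of leaky applied on the integer accumulator (hypothetical; slope 0.1 as in darknet's leaky):

```c
#include <stdint.h>

// Hypothetical: leaky ReLU applied directly on the int32 accumulator.
// Because leaky(s*x) = s*leaky(x) for any positive scale s, this matches
// applying leaky in FP32 and then quantizing, up to integer rounding.
static inline int32_t leaky_int32(int32_t x)
{
    return x > 0 ? x : x / 10;  // slope 0.1; integer division truncates
}
```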