Open luoshiyong opened 5 months ago
ref ptq svg of engine.
By the time you see the graph above, I've already looked at the engine SVG. My problem is how to efficiently find which layers are faster when quantised, rather than spending time going through the graphs by hand.
In the process of YOLOv8 INT8 quantization, I find that some INT8 layers are slower than their FP16 versions, and the reformat operations are very time-consuming. For best precision, we can do sensitive-layer analysis to pick the proper layers to quantize, but for best speed, how should I identify which layers to quantize? (Some screenshots below.)
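To make the question concrete, here is a minimal sketch of the kind of comparison I would like to automate instead of reading the graphs. It assumes two per-layer profiles (one from the FP16 engine, one from the INT8 engine) already loaded as lists of records with `name` and `averageMs` fields; that record shape, the function names, and the `"reformat"` name marker are my assumptions for illustration, not a confirmed TensorRT format. Matching layers by name is also only approximate, since fusion can differ between the two builds.

```python
from typing import Dict, List, Tuple

# Hypothetical record shape: {"name": str, "averageMs": float}.
# Records without both fields (e.g. a leading {"count": N} entry) are skipped.
Profile = List[Dict[str, object]]


def avg_ms_by_name(profile: Profile) -> Dict[str, float]:
    """Map layer name -> average latency in milliseconds."""
    out: Dict[str, float] = {}
    for rec in profile:
        if "name" in rec and "averageMs" in rec:
            out[str(rec["name"])] = float(rec["averageMs"])
    return out


def int8_slower_layers(
    fp16_profile: Profile, int8_profile: Profile
) -> List[Tuple[str, float, float]]:
    """Return (name, fp16_ms, int8_ms) for layers present in both profiles
    where INT8 is slower, sorted by the largest slowdown first."""
    fp16 = avg_ms_by_name(fp16_profile)
    int8 = avg_ms_by_name(int8_profile)
    slower = [
        (name, fp16[name], int8[name])
        for name in fp16.keys() & int8.keys()
        if int8[name] > fp16[name]
    ]
    slower.sort(key=lambda row: row[2] - row[1], reverse=True)
    return slower


def reformat_overhead_ms(profile: Profile, marker: str = "reformat") -> float:
    """Sum the latency of layers whose name contains the marker
    (case-insensitive). The marker is an assumption about layer naming."""
    return sum(
        ms for name, ms in avg_ms_by_name(profile).items()
        if marker in name.lower()
    )


if __name__ == "__main__":
    # Made-up numbers, only to show the call pattern.
    fp16 = [{"count": 100},
            {"name": "conv1", "averageMs": 0.10},
            {"name": "conv2", "averageMs": 0.20}]
    int8 = [{"count": 100},
            {"name": "conv1", "averageMs": 0.05},
            {"name": "Reformatting CopyNode for Input Tensor 0 to conv2",
             "averageMs": 0.04},
            {"name": "conv2", "averageMs": 0.25}]
    for name, f_ms, i_ms in int8_slower_layers(fp16, int8):
        print(f"{name}: fp16={f_ms:.3f} ms, int8={i_ms:.3f} ms")
    print(f"reformat overhead: {reformat_overhead_ms(int8):.3f} ms")
```

Layers that show up in the first list, plus any layer whose INT8 gain is smaller than the reformat cost around it, would be the candidates to keep in FP16. Is there a supported way to get this directly?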