ambitious-octopus opened 1 month ago
Hi @ambitious-octopus,
The reason we have this modification (a tuple instead of a single Tensor) is the different value ranges of the two parts of the original tensor: `y_bb` (bounding-box coordinates, 4x8400) takes values in [0, 640], while `y_cls` (per-class scores, 80x8400) takes values in [0, 1].
This makes quantizing the single concatenated tensor very problematic and results in poor accuracy for the quantized model: one part needs high resolution within [0, 1], while the other must stretch to cover values up to 640. The solution we suggested is therefore to keep the two parts separate during MCT quantization.
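The range mismatch described above can be illustrated with a toy uniform quantizer. This is a hypothetical sketch, not MCT's actual quantization scheme; the `quantize` helper and the sample values are illustrative only.

```python
import numpy as np

# Toy uniform n-bit quantizer with a single scale covering [lo, hi].
# (Illustrative only -- MCT's real quantizers are more sophisticated.)
def quantize(x, lo, hi, n_bits=8):
    scale = (hi - lo) / (2**n_bits - 1)
    return np.round((x - lo) / scale) * scale + lo

# Class scores live in [0, 1]; box coordinates can reach 640.
y_cls = np.linspace(0.0, 1.0, 5)

# One shared grid over [0, 640] -- what a single concatenated tensor forces.
# The quantization step is ~2.51, so every class score collapses to 0.
shared = quantize(y_cls, 0.0, 640.0)

# A dedicated grid over [0, 1] -- what the tuple split enables.
# The step is ~0.004, so the scores survive almost unchanged.
split = quantize(y_cls, 0.0, 1.0)

print(shared)                          # all zeros
print(np.max(np.abs(split - y_cls)))   # small quantization error
```

With a shared scale the box coordinates dominate the range and the class scores lose all resolution, which is exactly why the two parts are quantized separately.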
Let me know if you need a more detailed explanation. Thanks, Idan
@Idan-BenAmi Hi! Thanks for the explanation! However, modifying the original tensor output to a tuple in the ultralytics package would break all of our current inference pipelines. Would it be possible to split the original tensor into a tuple in the model_optimization repo before actually starting the MCT quantization? Thanks
Hi @Laughing-q and @ambitious-octopus, I missed your last question, sorry for the delay. MCT isn't designed to handle this kind of manipulation. What do you think about keeping the split operation within the export code?
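The "split in the export code" idea can be sketched as a thin wrapper module that leaves the original model untouched and splits its single output just before quantization. This is a hypothetical illustration: `SplitOutputWrapper` and the dummy head are not part of the ultralytics or MCT APIs, and the channel-first `(B, 84, 8400)` layout is assumed from the thread.

```python
import torch
import torch.nn as nn

class SplitOutputWrapper(nn.Module):
    """Hypothetical wrapper: split the (B, 84, 8400) head output into
    the (y_bb, y_cls) tuple that MCT quantizes separately."""

    def __init__(self, model: nn.Module):
        super().__init__()
        self.model = model

    def forward(self, x):
        y = self.model(x)                            # (B, 84, 8400)
        y_bb, y_cls = torch.split(y, [4, 80], dim=1)
        return y_bb, y_cls                           # (B, 4, 8400), (B, 80, 8400)

# Stand-in model that only mimics the detection head's output shape.
class DummyHead(nn.Module):
    def forward(self, x):
        return torch.zeros(x.shape[0], 84, 8400)

wrapped = SplitOutputWrapper(DummyHead())
y_bb, y_cls = wrapped(torch.zeros(1, 3, 640, 640))
print(y_bb.shape, y_cls.shape)
```

Because the split lives only in the export path, the original model's `forward` keeps returning a single tensor for regular inference.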
The output of the `forward` method of the Detection Head needs to be a `torch.Tensor` instead of a `tuple`. This would facilitate integration with our original YOLOv8 model. Would it be possible to modify the `mct` pipeline to accept a `torch.Tensor` with shape `(B, 84, 8400)` instead of a tuple that splits it into `y_bb` with shape `(B, 8400, 4)` and `y_cls` with shape `(B, 8400, 80)`?

Sony Implementation:
Original YOLOv8 implementation:
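For completeness, the adaptation in the other direction can also be sketched: if a quantized model returns the `(y_bb, y_cls)` tuple, a thin wrapper can concatenate the parts back into the single tensor that existing inference pipelines expect. This is a hypothetical sketch; `ConcatOutputWrapper` and the dummy model are illustrative names, and the channel-first layout is assumed.

```python
import torch
import torch.nn as nn

class ConcatOutputWrapper(nn.Module):
    """Hypothetical wrapper: rejoin the (y_bb, y_cls) tuple into the single
    (B, 84, 8400) tensor that the original pipelines consume."""

    def __init__(self, model: nn.Module):
        super().__init__()
        self.model = model

    def forward(self, x):
        y_bb, y_cls = self.model(x)              # (B, 4, 8400), (B, 80, 8400)
        return torch.cat((y_bb, y_cls), dim=1)   # (B, 84, 8400)

# Stand-in model that only mimics the tuple output's shapes.
class DummyTupleHead(nn.Module):
    def forward(self, x):
        b = x.shape[0]
        return torch.zeros(b, 4, 8400), torch.zeros(b, 80, 8400)

wrapped = ConcatOutputWrapper(DummyTupleHead())
y = wrapped(torch.zeros(1, 3, 640, 640))
print(y.shape)
```

Since `torch.cat` is cheap and applied after quantization, this keeps the tuple split during MCT quantization while preserving the single-tensor interface downstream.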