Yolo and Tiny-Yolo are reference models, while Compressed Tiny-Yolo (with the 12th and 13th layers removed) and Tiny-Darknet (based on SqueezeNet) were trained from scratch. One observation (note that processing times were measured on a phone): although Tiny-Darknet reduces the weight file size, this does not translate into a significant reduction in processing (inference) time.
I am looking into other techniques such as quantization. Has anyone tried something similar before?
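For anyone unfamiliar with the idea, here is a minimal sketch of post-training linear (affine) quantization in plain Python: float32 weights are mapped to int8 with a scale and zero-point, which is why the weight file shrinks roughly 4x (1 byte per weight instead of 4). This is only illustrative; real toolchains (e.g. TFLite or PyTorch) do this per-channel with calibration data, and the function names below are my own, not from any library.

```python
def quantize_int8(weights):
    """Map float weights to int8 using a single scale and zero-point."""
    w_min, w_max = min(weights), max(weights)
    scale = (w_max - w_min) / 255.0 or 1.0  # avoid div-by-zero for constant weights
    zero_point = round(-w_min / scale) - 128
    q = [max(-128, min(127, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float weights from the int8 values."""
    return [(qi - zero_point) * scale for qi in q]

# Toy example: quantize, dequantize, and check the round-trip error,
# which is bounded by half the scale (the quantization step size).
weights = [-0.42, 0.0, 0.13, 0.5, -0.08, 0.31]
q, scale, zp = quantize_int8(weights)
restored = dequantize(q, scale, zp)
max_err = max(abs(w - r) for w, r in zip(weights, restored))
```

Note that this only shrinks storage; to also cut inference time, the runtime has to execute the convolutions in int8 arithmetic rather than dequantizing back to float, which is exactly the gap I hit with Tiny-Darknet (smaller file, same compute).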