Closed MohamedAtef321 closed 1 year ago
@MohamedAtef321 there's no bug here, just a mistaken assumption that TFLite models should be faster than PyTorch models on CPU. If you want speed ideas I'd suggest you just take a look at the daily export benchmarks. https://github.com/ultralytics/ultralytics/actions/runs/4338716714/jobs/7575625589
But in that benchmark the TensorFlow Lite inference time is lower than the PyTorch one!
Any reason for that? 🤷
Hello @glenn-jocher,
I observe the same issue as @MohamedAtef321 except the gap is even greater. The tflite version is 8 times slower than the original pytorch model on several of my computers with different OS (Mac, Ubuntu). I tested on the yolov8n-seg model with the default imgsz=640
I checked the latest benchmark : https://github.com/ultralytics/ultralytics/actions/runs/5249812787/jobs/9488947907 It suggests that tflite can be faster than the pytorch version on CPU (or at least not that far away) but I cannot reproduce similar results at all.
Computer specs :
Python 3.10.11, Ubuntu 20.04, CPU : Ryzen 9 5900X, ultralytics version : 8.0.117
With yolov8n-seg:
YOLO('yolov8n-seg')('https://ultralytics.com/images/bus.jpg', device="cpu")
Found https://ultralytics.com/images/bus.jpg locally at bus.jpg
image 1/1 /home/user/git/ultralytics/bus.jpg: 640x480 4 persons, 1 bus, 1 skateboard, 38.6ms
Speed: 1.4ms preprocess, 38.6ms inference, 3.0ms postprocess per image at shape (1, 3, 640, 640)
YOLO('yolov8n-seg').export(format='tflite')
YOLO('yolov8n-seg_saved_model/yolov8n-seg_float32.tflite')('https://ultralytics.com/images/bus.jpg', device="cpu")
/home/user/anaconda3/envs/ultralytics/bin/python /home/user/git/ultralytics/predict.py
Loading /home/user/git/ultralytics/yolov8m-seg_saved_model/yolov8m-seg_float32.tflite for TensorFlow Lite inference...
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
Found https://ultralytics.com/images/bus.jpg locally at bus.jpg
image 1/1 /home/user/git/ultralytics/bus.jpg: 640x640 4 persons, 1 bus, 1 skateboard, 306.0ms
Speed: 1.6ms preprocess, 306.0ms inference, 3.8ms postprocess per image at shape (1, 3, 640, 640)
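A note for anyone comparing these numbers: a single predict() call includes warm-up costs (XNNPACK kernel setup, thread-pool spin-up), so one-shot timings can be misleading. A minimal, framework-agnostic timing harness (the function name and defaults here are my own, not part of the ultralytics API) could look like:

```python
import time
from statistics import mean

def time_inference(fn, warmup=3, runs=20):
    """Return the mean latency of fn() in milliseconds, excluding warm-up.

    Both TFLite (XNNPACK) and PyTorch are noticeably slower on the first
    few calls, so we discard `warmup` runs before timing `runs` calls.
    """
    for _ in range(warmup):
        fn()
    times = []
    for _ in range(runs):
        t0 = time.perf_counter()
        fn()
        times.append((time.perf_counter() - t0) * 1e3)
    return mean(times)
```

For example, wrapping each model's predict call in a lambda and passing it to time_inference would give a steadier comparison than one-off calls.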
Do you have an idea why this is happening? How can it be fixed or improved?
Thanks.
Search before asking
YOLOv8 Component
Detection
Bug
I worked on Google Colab and tried to run yolov8n trained on custom data. After exporting the model to
.tflite
format, I ran the model in both PyTorch and TensorFlow Lite (float16 and float32) variants. I expected the TensorFlow Lite model to be faster than the PyTorch one, but the result surprised me: the PyTorch model has an inference time of 230 ms, while the TensorFlow Lite model takes almost 410 ms with both float16 and float32.
I have detailed it all in a PDF file 📄 in this Drive link: https://drive.google.com/file/d/1sGayaf3E5YAR1dZlKXp2T_eHPmb840M4/view?usp=sharing
Could you explain why this happens? And is there any way to enhance the performance or FPS of the yolov8n model using TensorFlow Lite (or any other software method)?
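One thing worth checking is how many CPU threads the TFLite interpreter is allowed to use: PyTorch typically uses all available cores by default, while TFLite may not. num_threads is a real parameter of tf.lite.Interpreter; the model path and thread count below are placeholders, and this sketch bypasses the ultralytics wrapper to time raw inference only:

```python
import numpy as np
import tensorflow as tf

# Load the exported model with an explicit thread count; by default
# the XNNPACK delegate may use fewer threads than PyTorch does.
interpreter = tf.lite.Interpreter(
    model_path="yolov8n_saved_model/yolov8n_float32.tflite",  # placeholder path
    num_threads=8,  # try matching your physical core count
)
interpreter.allocate_tensors()

inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

# Dummy 640x640 input, just to measure raw inference latency.
x = np.zeros(inp["shape"], dtype=np.float32)
interpreter.set_tensor(inp["index"], x)
interpreter.invoke()
y = interpreter.get_tensor(out["index"])
print(y.shape)
```

If raw invoke() latency improves with more threads, the gap you see is likely a threading default rather than a model problem.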
Any advice will be appreciated. 🙂
Environment
No response
Minimal Reproducible Example
No response
Additional
No response
Are you willing to submit a PR?