guangy10 opened 5 days ago
cc: @cbilgin @kimishpatel @shoumikhin
Something weird is going on, the numbers are too high. Investigating.
Here is the profiling of mv3 running on the executor_runner binary:
...
│ 186 │ Execute │ DELEGATE_CALL   │ 86.8285 │ 86.8285 │ 86.8285 │ 86.8285 │ 86.8285 │ 86.8285 │ ['aten.convolution.default', 'aten._native_batch_norm_legit_no_training.default', ..., 'aten.addmm.default'] (317 total) │ False │ CoreMLBackend │ nan │ nan │ nan │ nan │
│ 187 │ Execute │ Method::execute │ 86.8298 │ 86.8298 │ 86.8298 │ 86.8298 │ 86.8298 │ 86.8298 │ []                                                                                                                       │ False │               │ nan │ nan │ nan │ nan │
The avg of Method::execute is 86.829 ms, which is far off from the numbers we get from the benchmark app. @shoumikhin Is this the same as how test_forward is measured in the app?
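For what it's worth, every stats column in the rows above shows the same value because only a single execution was profiled. A minimal sketch of how such summary stats could be computed over multiple iterations (the p10/p50/p90/avg/min/max column naming is my assumption about the table layout, not taken from the tool):

```python
import statistics

def summarize(latencies_ms):
    """Summarize per-iteration latencies the way a profiling table might:
    percentiles, average, min, and max (column names are assumptions)."""
    s = sorted(latencies_ms)

    def pct(p):
        # Nearest-rank percentile over the sorted samples.
        idx = min(len(s) - 1, max(0, round(p / 100 * (len(s) - 1))))
        return s[idx]

    return {
        "p10": pct(10), "p50": pct(50), "p90": pct(90),
        "avg": statistics.fmean(s), "min": s[0], "max": s[-1],
    }

# A single profiled run yields the same value in every column,
# matching the repeated 86.8285 entries in the table above.
print(summarize([86.8285]))
```

With only one sample, every column collapses to that sample, which is consistent with the identical numbers in the DELEGATE_CALL row.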
Did you see any graph breaks?
🐛 Describe the bug
We are exercising the newly launched benchmarking infra with the in-tree enabled models under examples/models. You can follow the instructions to extract and inspect the benchmark_results.json. (Connecting the results to a dashboard is still WIP.) The preliminary data points collected from the benchmarking infra can be found here: https://github.com/pytorch/executorch/tree/main/extension/benchmark#preliminary-benchmark-results. They show that some previously enabled models, e.g. MobileNetV2/3 and InceptionV3/4, are running slower than expected with Core ML delegates.
The models are exported using the coreml export scripts (code pointer), and the app that loads and runs the exported models is located under extension/apple/benchmark.

Versions
From latest executorch main