yuenshome / yuenshome.github.io

https://yuenshome.github.io

mindspore lite perfProfiling and perfEvent #133

Open ysh329 opened 3 years ago

ysh329 commented 3 years ago

https://www.mindspore.cn/tutorial/lite/zh-CN/master/use/benchmark_tool.html

./benchmark [--modelFile=<MODELFILE>] [--accuracyThreshold=<ACCURACYTHRESHOLD>]
   [--benchmarkDataFile=<BENCHMARKDATAFILE>] [--benchmarkDataType=<BENCHMARKDATATYPE>]
   [--cpuBindMode=<CPUBINDMODE>] [--device=<DEVICE>] [--help]
   [--inDataFile=<INDATAFILE>] [--loopCount=<LOOPCOUNT>]
   [--numThreads=<NUMTHREADS>] [--warmUpLoopCount=<WARMUPLOOPCOUNT>]
   [--enableFp16=<ENABLEFP16>] [--timeProfiling=<TIMEPROFILING>]
   [--inputShapes=<INPUTSHAPES>] [--perfProfiling=<PERFPROFILING>]
   [--perfEvent=<PERFEVENT>]
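| Parameter | Attribute | Description | Type | Default | Value range |
| --- | --- | --- | --- | --- | --- |
| `--perfProfiling=<PERFPROFILING>` | Optional | Takes effect during CPU performance benchmarking. Specifies whether PerfProfiler prints per-operator CPU performance. Has no effect when timeProfiling is true. Currently supported only on aarch64 CPUs. | Boolean | false | true, false |
| `--perfEvent=<PERFEVENT>` | Optional | Takes effect during CPU performance benchmarking. Specifies which CPU performance counters PerfProfiler prints. CYCLE prints each operator's CPU cycle count and instruction count; CACHE prints each operator's cache read count and cache miss count; STALL prints the CPU frontend stall cycles and backend stall cycles. | String | CYCLE | CYCLE/CACHE/STALL |
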
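
For orientation, counters such as the CYCLE pair (cycles and instructions) can be read on Linux through the `perf_event_open(2)` syscall. The following is a minimal sketch of that technique and not MindSpore's PerfProfiler code: `PerfSample`, `OpenHardwareCounter`, and `MeasureCyclesAndInstructions` are hypothetical names, and the sketch assumes a Linux system whose kernel and permissions expose hardware counters. When they are unavailable (for example in many containers or VMs), the function still runs the workload and returns false.

```cpp
// Sketch only: counts CPU cycles and instructions around a callable using
// the Linux perf_event_open(2) syscall. Hypothetical helper names throughout.
#include <linux/perf_event.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#include <cstdint>
#include <cstring>
#include <functional>

struct PerfSample {  // hypothetical result type
  uint64_t cycles;
  uint64_t instructions;
};

// glibc ships no wrapper for perf_event_open, so go through syscall(2).
static int PerfEventOpen(perf_event_attr *attr, pid_t pid, int cpu, int group_fd, unsigned long flags) {
  return static_cast<int>(syscall(SYS_perf_event_open, attr, pid, cpu, group_fd, flags));
}

// Opens one hardware counter. Pass group_fd = -1 to create a group leader.
static int OpenHardwareCounter(uint64_t config, int group_fd) {
  perf_event_attr attr;
  std::memset(&attr, 0, sizeof(attr));
  attr.type = PERF_TYPE_HARDWARE;
  attr.size = sizeof(attr);
  attr.config = config;
  attr.disabled = (group_fd == -1) ? 1 : 0;  // only the leader starts disabled
  attr.exclude_kernel = 1;                   // count user-space work only
  attr.exclude_hv = 1;
  attr.read_format = PERF_FORMAT_GROUP;      // one read() returns the whole group
  return PerfEventOpen(&attr, 0, -1, group_fd, 0);  // calling thread, any CPU
}

// Runs `work` exactly once. Returns true and fills `out` only if both
// counters could be opened and read; otherwise `out` is left untouched.
bool MeasureCyclesAndInstructions(const std::function<void()> &work, PerfSample *out) {
  int leader = OpenHardwareCounter(PERF_COUNT_HW_CPU_CYCLES, -1);
  int member = (leader >= 0) ? OpenHardwareCounter(PERF_COUNT_HW_INSTRUCTIONS, leader) : -1;
  if (leader < 0 || member < 0) {
    if (leader >= 0) close(leader);
    work();  // counters unavailable; still execute the workload
    return false;
  }

  ioctl(leader, PERF_EVENT_IOC_RESET, PERF_IOC_FLAG_GROUP);
  ioctl(leader, PERF_EVENT_IOC_ENABLE, PERF_IOC_FLAG_GROUP);
  work();
  ioctl(leader, PERF_EVENT_IOC_DISABLE, PERF_IOC_FLAG_GROUP);

  // With PERF_FORMAT_GROUP the read layout is { nr, values[nr] }.
  struct {
    uint64_t nr;
    uint64_t values[2];
  } data = {};
  ssize_t n = read(leader, &data, sizeof(data));
  close(member);
  close(leader);
  if (n != static_cast<ssize_t>(sizeof(data)) || data.nr != 2) return false;

  out->cycles = data.values[0];        // leader: CPU cycles
  out->instructions = data.values[1];  // member: instructions
  return true;
}
```

Grouping the two events under one leader means both are enabled and disabled together, so the pair describes the same stretch of execution. A CACHE-style pair would presumably swap in `PERF_COUNT_HW_CACHE_REFERENCES` and `PERF_COUNT_HW_CACHE_MISSES`, and a STALL-style pair `PERF_COUNT_HW_STALLED_CYCLES_FRONTEND` and `PERF_COUNT_HW_STALLED_CYCLES_BACKEND`; whether the benchmark tool uses exactly these constants is an assumption here.
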
ysh329 commented 3 years ago

perf_event

#ifdef ENABLE_ARM64
#include <linux/perf_event.h>
#include <sys/ioctl.h>
#include <asm/unistd.h>
#include <unistd.h>
#endif

int Benchmark::MarkPerformance() {
  // ...........
  // ...........

  if (flags_->time_profiling_) {
    const std::vector<std::string> per_op_name = {"opName", "avg(ms)", "percent", "calledTimes", "opTotalTime"};
    const std::vector<std::string> per_op_type = {"opType", "avg(ms)", "percent", "calledTimes", "opTotalTime"};
    PrintResult(per_op_name, op_times_by_name_);
    PrintResult(per_op_type, op_times_by_type_);
#ifdef ENABLE_ARM64
  } else if (flags_->perf_profiling_) {
    if (flags_->perf_event_ == "CACHE") {
      const std::vector<std::string> per_op_name = {"opName", "cache ref(k)", "cache ref(%)", "miss(k)", "miss(%)"};
      const std::vector<std::string> per_op_type = {"opType", "cache ref(k)", "cache ref(%)", "miss(k)", "miss(%)"};
      PrintPerfResult(per_op_name, op_perf_by_name_);
      PrintPerfResult(per_op_type, op_perf_by_type_);
    } else if (flags_->perf_event_ == "STALL") {
      const std::vector<std::string> per_op_name = {"opName", "frontend(k)", "frontend(%)", "backendend(k)",
                                                    "backendend(%)"};
      const std::vector<std::string> per_op_type = {"opType", "frontend(k)", "frontend(%)", "backendend(k)",
                                                    "backendend(%)"};
      PrintPerfResult(per_op_name, op_perf_by_name_);
      PrintPerfResult(per_op_type, op_perf_by_type_);
    } else {
      const std::vector<std::string> per_op_name = {"opName", "cycles(k)", "cycles(%)", "ins(k)", "ins(%)"};
      const std::vector<std::string> per_op_type = {"opType", "cycles(k)", "cycles(%)", "ins(k)", "ins(%)"};
      PrintPerfResult(per_op_name, op_perf_by_name_);
      PrintPerfResult(per_op_type, op_perf_by_type_);
    }
#endif
  }
  // ...........
ysh329 commented 3 years ago

https://github.com/mindspore-ai/mindspore/blob/6f81d28a88ad832f21bd5adee0dd97453d301d12/mindspore/lite/tools/benchmark/benchmark.cc#L760-L765
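
The column headers in `MarkPerformance` pair each raw counter in thousands (for example `cycles(k)`) with its share of the total (`cycles(%)`). As a rough illustration of how such rows could be derived from per-operator counter pairs, here is a minimal sketch. `PerfRow`, `CounterPair`, and `BuildPerfRows` are hypothetical names, and the real `PrintPerfResult` may compute or average its values differently.

```cpp
// Sketch only: hypothetical types and names, not MindSpore's PrintPerfResult.
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// One output row: two counters, each as thousands and as a percentage.
struct PerfRow {
  std::string name;
  double first_k;     // e.g. "cycles(k)", "cache ref(k)", "frontend(k)"
  double first_pct;   // share of the first counter's total, in percent
  double second_k;    // e.g. "ins(k)", "miss(k)"
  double second_pct;  // share of the second counter's total, in percent
};

// Raw values of the two counters collected for one operator name or type.
using CounterPair = std::pair<uint64_t, uint64_t>;

std::vector<PerfRow> BuildPerfRows(const std::map<std::string, CounterPair> &counters) {
  // First pass: totals, needed as the denominators of the percentages.
  uint64_t total_first = 0;
  uint64_t total_second = 0;
  for (const auto &kv : counters) {
    total_first += kv.second.first;
    total_second += kv.second.second;
  }

  // Second pass: scale to thousands and compute each entry's share.
  std::vector<PerfRow> rows;
  rows.reserve(counters.size());
  for (const auto &kv : counters) {
    PerfRow row;
    row.name = kv.first;
    row.first_k = kv.second.first / 1000.0;
    row.first_pct = total_first ? 100.0 * kv.second.first / total_first : 0.0;
    row.second_k = kv.second.second / 1000.0;
    row.second_pct = total_second ? 100.0 * kv.second.second / total_second : 0.0;
    rows.push_back(row);
  }
  return rows;  // ordered by name, because std::map iterates in key order
}
```

The same function serves both the per-name and the per-type tables: only the key used to accumulate the counters changes.
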