apache / mxnet

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
https://mxnet.apache.org
Apache License 2.0
20.77k stars 6.8k forks

The same model with different weights has different inference speeds. #18901

Closed ikeliou closed 4 years ago

ikeliou commented 4 years ago

CPU inference of the LResNet100E-IR model I trained is much slower than the pretrained model from insightface's model zoo.

my model: https://drive.google.com/drive/folders/15fRNmKKY-Hoz0L7oayD8DXA-DbGtaZM1?usp=sharing

insightface model: https://www.dropbox.com/s/tj96fsm6t6rq8ye/model-r100-arcface-ms1m-refine-v2.zip?dl=0

MXNet version: 1.6.0

code:

initialization:
void face_feature_init(std::string modelFolder) {
  std::string json_file = "models/models-symbol.json";
  std::string param_file = "models/models-0000.params";

  BufferFile json_data(json_file);
  BufferFile param_data(param_file);

  // Parameters
  int dev_type = 1;  // 1: cpu, 2: gpu
  int dev_id = 0;  // arbitrary.
  mx_uint num_input_nodes = 1;  // 1 for feedforward
  const char* input_key[1] = { "data" };
  const char** input_keys = input_key;

  const mx_uint input_shape_indptr[2] = { 0, 4 };
  const mx_uint input_shape_data[4] = { 1,
                                        static_cast<mx_uint>(channels),
                                        static_cast<mx_uint>(height),
                                        static_cast<mx_uint>(width) };

  if (json_data.GetLength() == 0 || param_data.GetLength() == 0) {
    return;
  }

  MXPredCreate(static_cast<const char*>(json_data.GetBuffer()),
          static_cast<const char*>(param_data.GetBuffer()),
          static_cast<int>(param_data.GetLength()),
          dev_type,
          dev_id,
          num_input_nodes,
          input_keys,
          input_shape_indptr,
          input_shape_data,
          &pred_hnd);
  assert(pred_hnd);
}

inference:
void face_feature_getfeature(const cv::Mat &img, cv::Mat &feature) {
#ifdef debugtime
  PerformanceTimer pt("face_feature_getfeature");
#endif
  auto image_size = static_cast<std::size_t>(width * height * channels);
  std::vector<mx_float> image_data(image_size);

  GetImageFile(img, image_data.data(), channels, cv::Size(width, height));

  // Set Input Image
  MXPredSetInput(pred_hnd, "data", image_data.data(), static_cast<mx_uint>(image_size));

  // Do Predict Forward
  MXPredForward(pred_hnd);

  mx_uint output_index = 0;

  mx_uint* shape = nullptr;
  mx_uint shape_len;

  // Get Output Result
  MXPredGetOutputShape(pred_hnd, output_index, &shape, &shape_len);

  std::size_t size = 1;
  for (mx_uint i = 0; i < shape_len; ++i) { size *= shape[i]; }

  std::vector<float> data(size);

  MXPredGetOutput(pred_hnd, output_index, &(data[0]), static_cast<mx_uint>(size));
#ifdef debug
  PrintOutputResult(data);
#endif
  // Copy the raw output into a column vector and L2-normalize it.
  cv::Mat vector(static_cast<int>(size), 1, CV_32F);
  memcpy(vector.data, data.data(), size * sizeof(float));
  cv::Mat _l2;
  cv::multiply(vector, vector, _l2);
  float l2 = static_cast<float>(std::sqrt(cv::sum(_l2).val[0]));
  vector = vector / l2;
  feature = vector;
}
github-actions[bot] commented 4 years ago

Welcome to Apache MXNet (incubating)! We are on a mission to democratize AI, and we are glad that you are contributing to it by opening this issue. Please make sure to include all the relevant context, and one of the @apache/mxnet-committers will be here shortly. If you are interested in contributing to our project, let us know! Also, be sure to check out our guide on contributing to MXNet and our development guides wiki.

ikeliou commented 4 years ago

I found that the problem is the same as in issue #17953.