yuenshome / yuenshome.github.io

https://yuenshome.github.io

Image preprocessing #5

Open ysh329 opened 5 years ago

ysh329 commented 5 years ago

List several common image preprocessing procedures used by the Caffe and Darknet frameworks.

ysh329 commented 5 years ago

Image preprocessing in Caffe

This code is located in caffe/examples/cpp_classification/classification.cpp:

void Classifier::Preprocess(const cv::Mat& img,
                            std::vector<cv::Mat>* input_channels) {
  /* Convert the input image to the input image format of the network. */
  cv::Mat sample;
  if (img.channels() == 3 && num_channels_ == 1)
    cv::cvtColor(img, sample, cv::COLOR_BGR2GRAY);
  else if (img.channels() == 4 && num_channels_ == 1)
    cv::cvtColor(img, sample, cv::COLOR_BGRA2GRAY);
  else if (img.channels() == 4 && num_channels_ == 3)
    cv::cvtColor(img, sample, cv::COLOR_BGRA2BGR);
  else if (img.channels() == 1 && num_channels_ == 3)
    cv::cvtColor(img, sample, cv::COLOR_GRAY2BGR);
  else
    sample = img;

  cv::Mat sample_resized;
  if (sample.size() != input_geometry_)
    cv::resize(sample, sample_resized, input_geometry_);
  else
    sample_resized = sample;

  cv::Mat sample_float;
  if (num_channels_ == 3)
    sample_resized.convertTo(sample_float, CV_32FC3);
  else
    sample_resized.convertTo(sample_float, CV_32FC1);

  cv::Mat sample_normalized;
  cv::subtract(sample_float, mean_, sample_normalized);

  /* This operation will write the separate BGR planes directly to the
   * input layer of the network because it is wrapped by the cv::Mat
   * objects in input_channels. */
  cv::split(sample_normalized, *input_channels);

  CHECK(reinterpret_cast<float*>(input_channels->at(0).data)
        == net_->input_blobs()[0]->cpu_data())
    << "Input channels are not wrapping the input layer of the network.";
}
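
The last two steps of `Preprocess` (mean subtraction, then `cv::split` into planes that wrap the input blob) amount to turning an interleaved HWC image into contiguous CHW planes. A minimal sketch of that layout change in plain C++, without OpenCV or Caffe, might look like this. The function name `SubtractMeanAndSplit` and the raw `blob` pointer are hypothetical stand-ins for illustration only and are not part of the Caffe example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from Caffe): subtracts a per-pixel mean from an
// interleaved (HWC) float image and writes the result as separate channel
// planes (CHW) into `blob`, which stands in for the network's input buffer.
// `blob` must have room for width * height * channels floats.
void SubtractMeanAndSplit(const std::vector<float>& hwc,
                          const std::vector<float>& mean_hwc,
                          int width, int height, int channels,
                          float* blob) {
  const std::size_t plane = static_cast<std::size_t>(width) * height;
  for (std::size_t p = 0; p < plane; ++p) {
    for (int ch = 0; ch < channels; ++ch) {
      const std::size_t src = p * channels + ch;  // interleaved index
      blob[ch * plane + p] = hwc[src] - mean_hwc[src];  // planar index
    }
  }
}
```

Because each plane is just an offset into one contiguous buffer, writing the planes fills the input directly, which is the property the `CHECK` at the end of `Preprocess` verifies.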

ysh329 commented 5 years ago

Image preprocessing in Darknet

I found a series of preprocessing steps in Darknet. In the order they are applied:

  1. Read the image, using either OpenCV or Darknet's own implementation (OpenCV is assumed here).
  2. Convert the image to float and normalize it in the same pass (each pixel value is divided by 255), storing the result in CHW layout instead of HWC.
  3. Swap the R and B channels. OpenCV loads images in BGR order, so this swap produces RGB.
  4. Resize the image with the same scale factor for both sides (aspect ratio preserved), so that it fits within the model input shape.
  5. Create a new blank image with the model input shape, fill it with 0.5, then embed the aspect-preserving resized image from step 4 into it.