BVLC / caffe

Caffe: a fast open framework for deep learning.
http://caffe.berkeleyvision.org/

How is caffe resizing my images for training in transform_param #6822

Open AbhimanyuAryan opened 5 years ago

AbhimanyuAryan commented 5 years ago

I want to understand whether Caffe loses the detail of my high-resolution image when it resizes it to 300x300.

There are two ways my 1200x990 image could be resized to 300x300: 1) cropping or 2) squishing.

In the first case, cropping, it loses details of my actual labelled image.

In the second case, squishing, my original high-resolution image is shrunk to a small size, which also means it's a waste to pass in a high-resolution image.
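
To put rough numbers on the two cases, here is a minimal sketch. It assumes that cropping takes a 300x300 window out of the original with no prior downscaling, and that squishing scales each axis independently. The names Size, Scale, CropKeptFraction and SquishScale are made up for illustration and are not part of Caffe.

```cpp
#include <cassert>

// Hypothetical helper types for illustration; not Caffe types.
struct Size { int width; int height; };
struct Scale { double x; double y; };

// Fraction of the source pixels that survive a plain crop to `target`
// (assumes the crop window is cut out without any scaling first).
double CropKeptFraction(Size src, Size target) {
  assert(src.width >= target.width && src.height >= target.height);
  return (static_cast<double>(target.width) * target.height) /
         (static_cast<double>(src.width) * src.height);
}

// Per-axis scale factors when the whole image is squished to `target`.
// Different x and y factors mean the aspect ratio changes.
Scale SquishScale(Size src, Size target) {
  assert(src.width > 0 && src.height > 0);
  return {static_cast<double>(target.width) / src.width,
          static_cast<double>(target.height) / src.height};
}
```

For 1200x990 to 300x300, CropKeptFraction gives about 0.076 (roughly 7.6% of the pixels are kept), while SquishScale gives 0.25 horizontally and about 0.303 vertically, so the whole image is kept but at reduced and slightly distorted resolution.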

I then looked at the source and at train.prototxt:

transform_param {
    resize_param {
        resize_mode: WARP
        height: 300
        width: 300
    }
}

Then I looked at the C++ code for WARP and found:

Link to cpp code here

void UpdateBBoxByResizePolicy(const ResizeParameter& param,
                              const int old_width, const int old_height,
                              NormalizedBBox* bbox) {
  float new_height = param.height();
  float new_width = param.width();
  float orig_aspect = static_cast<float>(old_width) / old_height;
  float new_aspect = new_width / new_height;

  float x_min = bbox->xmin() * old_width;
  float y_min = bbox->ymin() * old_height;
  float x_max = bbox->xmax() * old_width;
  float y_max = bbox->ymax() * old_height;
.....
.....

and the actual logic:

  switch (param.resize_mode()) {
    case ResizeParameter_Resize_mode_WARP:
      x_min = std::max(0.f, x_min * new_width / old_width);
      x_max = std::min(new_width, x_max * new_width / old_width);
      y_min = std::max(0.f, y_min * new_height / old_height);
      y_max = std::min(new_height, y_max * new_height / old_height);
      break;

I understand C++, but what I don't understand is what bbox is, and what the logic for WARP mode is.

Can someone please explain to me what's happening with my 1200x900 image? Thanks.
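
For reference, this is a standalone sketch of the arithmetic in the WARP case quoted above. It assumes that bbox is the labelled bounding box with corners stored as fractions of the image size (which the multiplication by old_width and old_height suggests). Box and WarpBoxToPixels are hypothetical names, not Caffe's API, and the sketch stops at pixel coordinates because the rest of the original function is elided.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for NormalizedBBox: corners as fractions in [0, 1].
struct Box { float xmin, ymin, xmax, ymax; };

// Reproduces the WARP arithmetic from the excerpt: returns the box in pixel
// coordinates of the resized (new_width x new_height) image.
Box WarpBoxToPixels(const Box& norm, int old_width, int old_height,
                    float new_width, float new_height) {
  assert(old_width > 0 && old_height > 0);

  // Normalized coordinates -> pixels in the original image.
  float x_min = norm.xmin * old_width;
  float y_min = norm.ymin * old_height;
  float x_max = norm.xmax * old_width;
  float y_max = norm.ymax * old_height;

  // Scale each axis independently, then clamp to the new image bounds.
  x_min = std::max(0.f, x_min * new_width / old_width);
  x_max = std::min(new_width, x_max * new_width / old_width);
  y_min = std::max(0.f, y_min * new_height / old_height);
  y_max = std::min(new_height, y_max * new_height / old_height);

  return {x_min, y_min, x_max, y_max};
}
```

With a 1200x900 image and a box spanning 0.25 to 0.75 horizontally and 0.5 to 1.0 vertically, this gives (75, 150) to (225, 300) in the 300x300 output. In this arithmetic the box is rescaled by the same per-axis factors as the image; nothing is cut away.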