chainer / chainercv

ChainerCV: a Library for Deep Learning in Computer Vision
MIT License

Inconsistency in (row, col) and (col, row) conventions #86

Closed: yuyu2172 closed this issue 7 years ago

yuyu2172 commented 7 years ago

The transforms are inconsistent in the conventions they use for the order of arguments and return values. The two conventions in question are (row, col) and (col, row). For example, a transform that takes the arguments flip_x, flip_y in this order follows the (col, row) convention.

The issue is that some functions follow the (col, row) convention, whereas image shapes follow the (row, col) convention.
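To make the mismatch concrete, here is a minimal sketch using NumPy. The `flip` function is hypothetical and written only for illustration; it is not ChainerCV's implementation. It assumes a CHW image and takes its arguments in (col, row) order, while `img.shape` reports the spatial axes in (row, col) order.

```python
import numpy as np

# A CHW image: 3 channels, 4 rows (height), 6 columns (width).
img = np.arange(3 * 4 * 6).reshape(3, 4, 6)

# The shape lists the spatial axes in (row, col) order.
_, n_rows, n_cols = img.shape


def flip(img, x_flip=False, y_flip=False):
    """Hypothetical flip whose arguments are in (col, row) order."""
    if y_flip:
        img = img[:, ::-1, :]  # reverse the row axis (axis 1)
    if x_flip:
        img = img[:, :, ::-1]  # reverse the column axis (axis 2)
    return img


# The first positional argument acts on the LAST spatial axis of the shape.
flipped = flip(img, True, False)
```

A caller who reads `img.shape` and then passes positional arguments has to remember that the two orders are reversed, which is the source of confusion described above.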

Currently, the following code is related to this issue.

yuyu2172 commented 7 years ago

For return values, packing all of them into a dictionary may be a good idea. For example, random_flip would return img, params, where params is a dictionary whose keys are x_flip and y_flip.
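A rough sketch of what this proposal could look like is below. The function name `random_flip_sketch` and its `x_random` / `y_random` arguments are assumptions made for illustration; only the `img, params` return shape and the `x_flip` / `y_flip` keys come from the proposal.

```python
import random

import numpy as np


def random_flip_sketch(img, x_random=False, y_random=False):
    """Randomly flip a CHW image and report what was done in a dict.

    Hypothetical illustration of the proposal, not the library's code.
    """
    x_flip = random.choice([True, False]) if x_random else False
    y_flip = random.choice([True, False]) if y_random else False

    if y_flip:
        img = img[:, ::-1, :]  # flip vertically (row axis)
    if x_flip:
        img = img[:, :, ::-1]  # flip horizontally (column axis)

    # Keys make the meaning explicit, so tuple order no longer matters.
    params = {'x_flip': x_flip, 'y_flip': y_flip}
    return img, params


img = np.arange(3 * 4 * 6).reshape(3, 4, 6)
out, params = random_flip_sketch(img, x_random=True, y_random=True)
```

Because the caller looks values up by key (`params['x_flip']`), the result reads the same whichever axis convention the rest of the library settles on.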

This is good for three reasons.

yuyu2172 commented 7 years ago

@Hakuyume The code in your PR would change under the proposal above. What do you think about it?

yuyu2172 commented 7 years ago

For arguments, the following functions are related.

Although there is an inconsistency, using x_*, y_* in this order for arguments is fine. The reasons behind this are

The other alternatives would be to make everything consistent. However, I found them to be less convincing.

The first alternative would be to make everything follow y, x order. However, this raises the question of whether to keep the order of elements in bounding boxes and keypoints, which are currently in x, y order. In scikit-image, everything, including bounding boxes, is in y, x order. I think that using y, x for bounding boxes and keypoints is a very rare convention.
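To show what switching bounding boxes to y, x order would involve, here is a small sketch. It assumes a box is stored as (x_min, y_min, x_max, y_max); that exact layout is an assumption for illustration, since the thread only says the elements are in x, y order.

```python
import numpy as np

# One box per row, assumed layout: (x_min, y_min, x_max, y_max).
bbox_xy = np.array([[10, 20, 50, 80]], dtype=np.float32)

# Reorder the columns to (y_min, x_min, y_max, x_max).
bbox_yx = bbox_xy[:, [1, 0, 3, 2]]

# The same permutation converts back, because it is its own inverse.
bbox_roundtrip = bbox_yx[:, [1, 0, 3, 2]]
```

The conversion itself is cheap; the cost of this alternative is that every caller and dataset loader would have to agree on the new, less common layout.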

The second alternative would be to make everything follow x, y order. This means that image shapes would be represented as W, H instead of H, W. H, W seems increasingly popular in many machine learning communities (e.g. TensorFlow), and I think this convention is more natural, especially when working with tensors of shape CHW.

Hakuyume commented 7 years ago

I agree with you that the inconsistency in axis order is a problem. I like (col, row) order, because

yuyu2172 commented 7 years ago

After some internal discussion, I agree that putting everything in (col, row) order seems best.

We need to change the resize-related functions.
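As an illustration of what a resize function in (col, row) order might look like, here is a minimal nearest-neighbour sketch in NumPy. The name `resize_nearest` and the `output_size` argument are hypothetical; this is not the library's resize, only a sketch assuming a CHW input and a target size given as (width, height).

```python
import numpy as np


def resize_nearest(img, output_size):
    """Nearest-neighbour resize of a CHW image.

    Hypothetical sketch: output_size is (width, height), i.e. (col, row).
    """
    out_w, out_h = output_size
    _, in_h, in_w = img.shape  # the array itself is still stored as C, H, W

    # For each output row/column, pick the nearest source index below it.
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w

    # Broadcasting the two index arrays yields shape (C, out_h, out_w).
    return img[:, rows[:, None], cols[None, :]]


img = np.arange(3 * 4 * 6).reshape(3, 4, 6)
resized = resize_nearest(img, (12, 8))  # width 12, height 8
```

Note that the argument is in (col, row) order while the returned array is still laid out as CHW, so the conversion between the two happens in exactly one place inside the function.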