lmb-freiburg / flownet2

FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks
https://lmb.informatik.uni-freiburg.de/Publications/2017/IMKDB17/

Data augmentation parameter #213

Closed abrosua closed 4 years ago

abrosua commented 4 years ago

Hi, I have a question about the augmentation parameters being used (e.g., -20% to 20% translation).

I noticed that the paper does not explicitly state the augmentation parameters used to train FlowNet2. Did you use the same parameters as those given in the FlowNet paper (quoted below)?

Specifically we sample translation from the range [−20%,20%] of the image width for x and y; rotation from [−17, 17]; scaling from [0.9, 2.0]. The Gaussian noise has a sigma uniformly sampled from [0, 0.04]; contrast is sampled within [−0.8, 0.4]; multiplicative color changes to the RGB channels per image from [0.5, 2]; gamma values from [0.7,1.5] and additive brightness changes using Gaussian with a sigma of 0.2.

I find the C++ sources of Caffe's augmentation layer quite hard to follow because I have little experience with Caffe. Thanks for your attention.

Regards
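For reference, the ranges quoted from the FlowNet paper above can be written down as a small sampler. This is only a minimal sketch of those published ranges, not the Caffe layer's actual implementation: the function name and dictionary keys are hypothetical, the rotation range is assumed to be in degrees, and uniform sampling is assumed wherever the quote does not name a distribution.

```python
import random


def sample_flownet_augmentation(rng, width):
    """Draw one parameter set from the ranges quoted from the FlowNet paper.

    All names here are illustrative; they do not mirror the Caffe layer.
    """
    return {
        # Translation: [-20%, 20%] of the image width, for both x and y.
        "translate_x": rng.uniform(-0.2, 0.2) * width,
        "translate_y": rng.uniform(-0.2, 0.2) * width,
        # Rotation: [-17, 17], assumed to be degrees.
        "rotate_deg": rng.uniform(-17.0, 17.0),
        # Scaling: [0.9, 2.0].
        "scale": rng.uniform(0.9, 2.0),
        # Gaussian noise: the sigma itself is sampled from [0, 0.04].
        "noise_sigma": rng.uniform(0.0, 0.04),
        # Contrast: [-0.8, 0.4].
        "contrast": rng.uniform(-0.8, 0.4),
        # Multiplicative colour change per RGB channel: [0.5, 2].
        "color_mult": [rng.uniform(0.5, 2.0) for _ in range(3)],
        # Gamma: [0.7, 1.5].
        "gamma": rng.uniform(0.7, 1.5),
        # Additive brightness: Gaussian with sigma 0.2 (zero mean assumed).
        "brightness": rng.gauss(0.0, 0.2),
    }


if __name__ == "__main__":
    params = sample_flownet_augmentation(random.Random(0), width=512)
    for key, value in params.items():
        print(key, value)
```

Passing an explicit `random.Random` instance keeps the draw reproducible, which helps when comparing against another augmentation pipeline.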

nikolausmayer commented 4 years ago

I can't give you an authoritative answer, but let me try to infer from the network definition:

Don't attribute too much to the exact values. If you stick to what's sensible within the task, and for your data, it will work. Our augmentation parameters were just random guesses, too.

abrosua commented 4 years ago

Thanks for the answers!

Just a quick question: the random cropping is performed at the end of the augmentation (after translation, rotation, Gaussian noise, contrast, etc.), isn't it?

nikolausmayer commented 4 years ago

yes!

abrosua commented 4 years ago

Noted! Thanks for your recommendations!

lelelexxx commented 4 years ago

Yes! Thanks for your kind replies! Could you please explain the purpose of "check if all 4 corners of the transformed image fit into the original image" in layers/augmentation_layer_base.cpp? As far as I understand, this lets us perform geometric augmentation that won't produce black corners. But generate_valid_spatial_coeffs() checks the four corner points [0, cropped_height], [cropped_width, cropped_height], [0, 0], [cropped_width, 0], and at that point the random crop area has not been generated yet, which means we could still end up cropping an augmented area with black corners. Did I miss something? Looking forward to your reply. Thanks again!

nikolausmayer commented 4 years ago

@lelelexxx in `layers/augmentation_layer_base.cpp`, lines 146–156, the crop coordinates are transformed into the generated random crop area. The corner points already check whether the final augmented area can be cropped from the original image without black corners.

lelelexxx commented 4 years ago

Thanks for your reply. I don't think I expressed myself clearly; apologies for my poor English. I have read lines 146-156. As I understand the code, the central crop box [0, cropped_height], [cropped_width, cropped_height], [0, 0], [cropped_width, 0] is transformed using the randomly generated affine coefficients. In my opinion, though, we should use the inverse of the affine coefficients to map the central crop box (in the transformed image) into the original image, and then check whether that crop box falls outside the original image. However, that process implies a central crop, not a random crop. So here are my questions:

1. Did FlowNet2 use random crops or central crops in the transformed images?
2. When checking for invalid parameters, why don't we use the inverse of the affine coefficients? Did I miss something? Could you please point it out for me?

Thanks a lot. The augmentation is driving me crazy~

nikolausmayer commented 4 years ago

I thought we should use the inverse of the affine coefficient

The augmentation kernel in layers/SpatialAugmentation.cu (line 25) uses backward-warping to generate the augmented crop. It retrieves pixels from the original image using the same computation as the corner-check for crop validity. This might be confusing if you are used to forward-warping (in which case you are right and the transformation would have to be inverted).

The crops are random. layers/augmentation_layer_base.cpp lines 148 and 155 contain the parameters for this.
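The corner check under backward-warping can be sketched as follows. This is a minimal illustration, not the code in layers/augmentation_layer_base.cpp: all function names are hypothetical, the parameter ranges are borrowed from the FlowNet paper, and the fallback to an untransformed centred crop is an assumption of this sketch. The key point is that the matrix maps crop coordinates to source coordinates, so the four crop corners are pushed through it directly and no inversion is needed.

```python
import math
import random


def crop_to_source_matrix(angle_deg, zoom, tx, ty, crop_w, crop_h, width, height):
    """2x3 affine matrix mapping a crop pixel (x, y) to a source-image position."""
    t = math.radians(angle_deg)
    a, b = math.cos(t) / zoom, -math.sin(t) / zoom
    d, e = math.sin(t) / zoom, math.cos(t) / zoom
    ccx, ccy = crop_w / 2.0, crop_h / 2.0
    # Rotate and scale about the crop centre, then move that centre to the
    # source centre plus the random offset (tx, ty).
    c = width / 2.0 + tx - (a * ccx + b * ccy)
    f = height / 2.0 + ty - (d * ccx + e * ccy)
    return [[a, b, c], [d, e, f]]


def apply_affine(mat, x, y):
    return (mat[0][0] * x + mat[0][1] * y + mat[0][2],
            mat[1][0] * x + mat[1][1] * y + mat[1][2])


def corners_fit(mat, crop_w, crop_h, width, height):
    """True if all four crop corners land inside the source image."""
    for x, y in [(0, 0), (crop_w, 0), (0, crop_h), (crop_w, crop_h)]:
        sx, sy = apply_affine(mat, x, y)
        if not (0 <= sx <= width - 1 and 0 <= sy <= height - 1):
            return False
    return True


def sample_valid_matrix(rng, crop_w, crop_h, width, height, max_tries=50):
    """Rejection-sample a transform whose crop needs no out-of-image pixels."""
    for _ in range(max_tries):
        mat = crop_to_source_matrix(
            rng.uniform(-17.0, 17.0), rng.uniform(0.9, 2.0),
            rng.uniform(-0.2, 0.2) * width, rng.uniform(-0.2, 0.2) * width,
            crop_w, crop_h, width, height)
        if corners_fit(mat, crop_w, crop_h, width, height):
            return mat
    # Assumption of this sketch: fall back to an untransformed centred crop.
    return crop_to_source_matrix(0.0, 1.0, 0.0, 0.0, crop_w, crop_h, width, height)
```

Because an affine map sends the crop rectangle to a convex quadrilateral, checking its four corners is enough to guarantee that every pixel of the crop is sampled from inside the source image.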

lelelexxx commented 4 years ago

Thanks. I have read all the issues about augmentation, and I have to say you are such a nice guy and always so patient. Sorry, I can't find layers/SpatialAugmentation.cu; is this file in this repo?

Let me go through the process once more. First, we generate the geometry parameters for the augmentation, including the crop coordinates. Then we use backward-warping to check whether the generated geometry parameters are valid for this crop area. If they are valid, we apply the transform to img1, img2, and the flow. Finally, the transformed images and flow are cropped according to the crop coordinates generated earlier. Am I right?

Thanks again.

nikolausmayer commented 4 years ago

Eh right, that was another codebase, my bad. It should be in layers/data_augmentation_layer.cu, line 25.

That sounds right, yes, except that the "cropping" at the end is not a separate step (i.e. not "transform then crop"). Rather, the final augmented patch is sampled directly from the original image. The validity check and the actual crop generation should be the same process; for some reason the former uses explicit computations while the latter wraps the affine transformation into a 2×3 matrix transMat.
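To illustrate "sampled directly from the original image", here is a minimal backward-warping sketch in plain Python. It is not the CUDA kernel from layers/data_augmentation_layer.cu: the function names are hypothetical, the image is a single-channel list of rows, and bilinear interpolation with border clamping is an assumption of this sketch. Each output pixel looks up its own source position through the 2×3 matrix, so there is never a full-size transformed image to crop from.

```python
def bilinear(image, sx, sy):
    """Bilinearly interpolate a single-channel image (list of rows) at (sx, sy)."""
    height, width = len(image), len(image[0])
    # Clamp to the image border; a valid transform should never need this.
    sx = min(max(sx, 0.0), width - 1.0)
    sy = min(max(sy, 0.0), height - 1.0)
    x0, y0 = int(sx), int(sy)
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    fx, fy = sx - x0, sy - y0
    top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
    bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def backward_warp(image, mat, crop_w, crop_h):
    """Build the crop by pulling every output pixel from the source image.

    `mat` is a 2x3 affine matrix mapping crop coordinates to source coordinates.
    """
    out = []
    for y in range(crop_h):
        row = []
        for x in range(crop_w):
            sx = mat[0][0] * x + mat[0][1] * y + mat[0][2]
            sy = mat[1][0] * x + mat[1][1] * y + mat[1][2]
            row.append(bilinear(image, sx, sy))
        out.append(row)
    return out


if __name__ == "__main__":
    # 4x4 test image whose value encodes its position: 10 * row + column.
    image = [[10 * r + c for c in range(4)] for r in range(4)]
    shift = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]  # pure translation by (1, 1)
    print(backward_warp(image, shift, 2, 2))
```

This sketch covers only the image lookup. The flow field would additionally need its vectors adjusted for the transform, which is not shown here.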