Closed chenchr closed 6 years ago
@ClementPinard thanks for your reply. It seems that the images and flow are not processed with mean subtraction here, whereas the Caffe source code of FlowNet2 does it.
This code is trying to replicate FlowNet1 results, which did not use image normalization. I may add a normalization option later on, but the Caffe pretrained networks clearly expect a [0,1] input and output raw optical flow (divided by 20).
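A minimal sketch of what that output scaling implies at inference time (the name `div_flow` is an assumption borrowed from common FlowNet ports, not necessarily this repository's API):

```python
def unscale_flow(net_output, div_flow=20.0):
    # The pretrained networks regress optical flow divided by 20,
    # so true pixel displacements are recovered by multiplying back.
    return net_output * div_flow
```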
@ClementPinard Thanks for your reply! I read the dispflownet-release code from https://lmb.informatik.uni-freiburg.de/resources/software.php. In the flownets model, the DataAugmentation layer has a parameter named recompute_mean. The layer is below:
layer {
  name: "img1s_aug"
  type: "DataAugmentation"
  bottom: "img1s"
  top: "img1_nomean"
  augmentation_param {
    augment_during_test: true
    recompute_mean: 1000
    mean_per_pixel: false
    crop_width: $TARGET_WIDTH
    crop_height: $TARGET_HEIGHT
  }
}
Looking at the implementation code, I think this computes the mean over 1000 forward runs and subtracts it every time. Maybe I am wrong. Can you tell me where you got the Caffe source code of FlowNet? Thanks!
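If that reading is right, the behavior could be sketched in Python roughly as follows (this is a guess at the semantics, not a port of the Caffe layer):

```python
import numpy as np

class RunningMeanSubtract:
    """Guess at recompute_mean semantics: accumulate a per-channel mean
    over the first n_updates forward passes, then subtract the current
    estimate from every input (a sketch, not a port of the Caffe layer)."""

    def __init__(self, n_updates=1000, channels=3):
        self.n_updates = n_updates
        self.count = 0
        self.mean = np.zeros(channels)

    def __call__(self, img):  # img: H x W x C array, values in [0, 1]
        if self.count < self.n_updates:
            self.count += 1
            batch_mean = img.reshape(-1, img.shape[-1]).mean(axis=0)
            # incremental running-mean update
            self.mean += (batch_mean - self.mean) / self.count
        return img - self.mean
```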
Thanks for looking into the FlowNetS code! :) I took my weights from the FlowNetS Caffe model of v1.0, but I doubt it changed with v1.2.
As far as I understood, the data augmentation layers are custom-made by Freiburg, and the code can be found here: https://github.com/lmb-freiburg/flownet2/blob/master/src/caffe/layers/data_augmentation_layer.cpp (initialization code) and here: https://github.com/lmb-freiburg/flownet2/blob/master/src/caffe/layers/data_augmentation_layer.cu (forward code).
Another interesting thing to look at is the layer parameters in the train graph:
layer {
  name: "img0s_aug"
  type: "DataAugmentation"
  bottom: "blob4"
  top: "img0_aug"
  top: "blob7"
  propagate_down: false
  augmentation_param {
    max_multiplier: 1
    augment_during_test: false
    recompute_mean: 1000
    mean_per_pixel: false
    translate {
      rand_type: "uniform_bernoulli"
      exp: false
      mean: 0
      spread: 0.4
      prob: 1.0
    }
    rotate {
      rand_type: "uniform_bernoulli"
      exp: false
      mean: 0
      spread: 0.4
      prob: 1.0
    }
    zoom {
      rand_type: "uniform_bernoulli"
      exp: true
      mean: 0.2
      spread: 0.4
      prob: 1.0
    }
    squeeze {
      rand_type: "uniform_bernoulli"
      exp: true
      mean: 0
      spread: 0.3
      prob: 1.0
    }
    lmult_pow {
      rand_type: "uniform_bernoulli"
      exp: true
      mean: -0.2
      spread: 0.4
      prob: 1.0
    }
    lmult_mult {
      rand_type: "uniform_bernoulli"
      exp: true
      mean: 0.0
      spread: 0.4
      prob: 1.0
    }
    lmult_add {
      rand_type: "uniform_bernoulli"
      exp: false
      mean: 0
      spread: 0.03
      prob: 1.0
    }
    sat_pow {
      rand_type: "uniform_bernoulli"
      exp: true
      mean: 0
      spread: 0.4
      prob: 1.0
    }
    sat_mult {
      rand_type: "uniform_bernoulli"
      exp: true
      mean: -0.3
      spread: 0.5
      prob: 1.0
    }
    sat_add {
      rand_type: "uniform_bernoulli"
      exp: false
      mean: 0
      spread: 0.03
      prob: 1.0
    }
    col_pow {
      rand_type: "gaussian_bernoulli"
      exp: true
      mean: 0
      spread: 0.4
      prob: 1.0
    }
    col_mult {
      rand_type: "gaussian_bernoulli"
      exp: true
      mean: 0
      spread: 0.2
      prob: 1.0
    }
    col_add {
      rand_type: "gaussian_bernoulli"
      exp: false
      mean: 0
      spread: 0.02
      prob: 1.0
    }
    ladd_pow {
      rand_type: "gaussian_bernoulli"
      exp: true
      mean: 0
      spread: 0.4
      prob: 1.0
    }
    ladd_mult {
      rand_type: "gaussian_bernoulli"
      exp: true
      mean: 0.0
      spread: 0.4
      prob: 1.0
    }
    ladd_add {
      rand_type: "gaussian_bernoulli"
      exp: false
      mean: 0
      spread: 0.04
      prob: 1.0
    }
    col_rotate {
      rand_type: "uniform_bernoulli"
      exp: false
      mean: 0
      spread: 1
      prob: 1.0
    }
    crop_width: 448
    crop_height: 320
    chromatic_eigvec: 0.51
    chromatic_eigvec: 0.56
    chromatic_eigvec: 0.65
    chromatic_eigvec: 0.79
    chromatic_eigvec: 0.01
    chromatic_eigvec: -0.62
    chromatic_eigvec: 0.35
    chromatic_eigvec: -0.83
    chromatic_eigvec: 0.44
    noise {
      rand_type: "uniform_bernoulli"
      exp: false
      mean: 0.03
      spread: 0.03
      prob: 1.0
    }
  }
}
As you can see, all augmentations are shut down for the test layer (which makes sense), but apparently recompute_mean is mostly used as a parameter for further augmentation. I believe it is there to update the eigenvalues specified at the end of the training layer (further investigation is needed, though), so the test layer will do nothing apart from computing unused mean values.
Okay, I just rechecked the code, and apparently there is indeed mean subtraction: https://github.com/lmb-freiburg/flownet2/blob/master/src/caffe/layers/data_augmentation_layer.cu#L593
I'm going to check how it's done in version 1.0 and maybe add it to my code. Thanks for the help!
From the FlowNet 1.0 model definition:
layer {
  name: "Mean1"
  type: "Mean"
  bottom: "img0"
  top: "img0_aug"
  mean_param {
    operation: SUBTRACT
    input_scale: 0.0039216
    value: 0.411451
    value: 0.432060
    value: 0.450141
  }
}
Hmmmmm, well, it's pretty clear that I was wrong from the beginning :smile: I'm going to try subtracting these mean values and see where it goes!
Thanks for your elaborate reply! I think we can calculate the mean offline, just like VGG-Net: compute the per-channel mean over the dataset's images once, fix it, and pass it to the normalize function. I am not sure what recompute_mean is for, or why the original authors did not pre-calculate it. Maybe for real-world data such as real-time video, the mean differs across environments, so the authors decided to calculate it adaptively.
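The offline approach described above could look like this (a sketch; how the dataset is iterated is assumed):

```python
import numpy as np

def dataset_channel_mean(images):
    """Compute a fixed per-channel mean over a dataset of H x W x 3
    images scaled to [0, 1], VGG-style, instead of recomputing it
    online the way the Caffe layer seems to do."""
    total = np.zeros(3)
    n_pixels = 0
    for img in images:
        total += img.reshape(-1, 3).sum(axis=0)
        n_pixels += img.shape[0] * img.shape[1]
    return total / n_pixels
```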
I think the subtracted values in the FlowNet 1.0 code are the mean values of the Flying Chairs dataset. The moving mean in later versions is there to adapt to other datasets such as KITTI or MPI Sintel.
For the moment you can simply add a new transforms.Normalize(mean=[0.411451, 0.432060, 0.450141], std=[1,1,1]) right after the first one (which was [0,0,0], [255,255,255]) in the Compose function.
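In numpy terms, those two Normalize steps chained together amount to the following (a sketch of the arithmetic, not the actual torchvision calls):

```python
import numpy as np

# Flying Chairs channel means quoted in the model definition above
CHAIRS_MEAN = np.array([0.411451, 0.432060, 0.450141])

def preprocess(img_uint8):
    # first Normalize: (x - 0) / 255 maps [0, 255] to [0, 1]
    img = img_uint8.astype(np.float64) / 255.0
    # second Normalize: (x - mean) / 1 subtracts the channel means
    return img - CHAIRS_MEAN
```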
Thank you very much!
@ClementPinard Hi ClementPinard. I have been using your PyTorch implementation these days. However, I found that the endpoint error (EPE) is somewhat weird: during training, the test EPE is larger than the train EPE. Did this situation occur to you?
It's indeed weird; it only occurred to me when I completely shut down data augmentation. Otherwise, every hyperparameter I tested made the train set perform worse than the test set. However, I mainly tested it on Flying Chairs.
What dataset did you use?
I am using Flying Chairs and flownets without BN. I set a larger batch size of 32 than the default setup.
Besides, I want to ask roughly how long it took you to train FlowNet on Flying Chairs from scratch. I have a Titan X, but I found that during training the GPU utilization jumps between 0% and 100%. I guess the training bottleneck is not compute capacity but the data loading done on the CPU. So far it has taken me 23 hours to reach epoch 102.
You have to set a high number of threads to be able to keep up with the GPU. All data augmentation is done by the CPU.
Also be sure to store the dataset on an SSD, or loading .flo files will take forever.
As far as I am concerned, I have a single 980 Ti GPU, and a batch size of 16 with 16 jobs is enough not to lose too much time loading files. With the right CPU you can even try a very high number of jobs; try with -j32, for example.
Closing this, as FlowNet can now take images whose dimensions are not divisible by 64.
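For reference, supporting sizes not divisible by 64 usually comes down to padding up front (a sketch; the repository's actual handling may differ):

```python
import numpy as np

def pad_to_multiple(img, multiple=64):
    # FlowNet halves the resolution six times in its encoder, so inputs
    # are commonly padded until both spatial dims are divisible by 64.
    h, w = img.shape[:2]
    pad_h = (-h) % multiple
    pad_w = (-w) % multiple
    return np.pad(img, ((0, pad_h), (0, pad_w), (0, 0)), mode='constant')
```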
Not for the moment, but it would be easy to do so. See here for more insight