IlchaeJung / RT-MDNet

The padding operation discussion #7

Open sydney0zq opened 5 years ago

sydney0zq commented 5 years ago

Hi, thanks for releasing the excellent work.

However, there are several points I cannot figure out.

First, in modules/pretrain_opts.py and options.py there are parameters such as padding_ratio and jitter. I also found that in https://github.com/IlchaeJung/RT-MDNet/blob/master/modules/data_prov.py we compute an extra padding area to get a larger image, then use jitter to scale this image, and finally crop the positive and negative regions. These operations are also applied during online tracking.

Is this just a means of augmentation? Or is it because MDNet's conv layers have no padding, so you add some padding to enlarge the original image size? I cannot figure it out, and the paper does not seem to explain it. Can you please explain why we do these padding and jitter operations?

Thank you very much!

IlchaeJung commented 5 years ago

Thank you for your interest. In MDNet, padding is applied to the patch of each candidate. Its effect is to let the network observe the object boundary precisely during online tracking. Without padding, the detected target size becomes smaller and smaller; conversely, with an improper padding size, the detected target size becomes bigger. Therefore, in RT-MDNet, we also use a padded image patch for each object proposal.
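To make the per-candidate padding concrete, here is a minimal sketch of enlarging one box about its center. The helper name `pad_box`, the `[x, y, w, h]` layout with a top-left `(x, y)`, and the example ratio are assumptions for illustration, not code taken from the repository.

```python
def pad_box(box, padding):
    """Enlarge a box about its center by the ratio `padding`.

    box: (x, y, w, h) with (x, y) assumed to be the top-left corner.
    padding: ratio of padded size to original size (1.0 means no padding).
    """
    x, y, w, h = box
    # The center stays fixed while width and height are scaled.
    cx = x + w / 2.0
    cy = y + h / 2.0
    padded_w = w * padding
    padded_h = h * padding
    return (cx - padded_w / 2.0, cy - padded_h / 2.0, padded_w, padded_h)


# A 40x30 box padded by an illustrative ratio of 1.5 becomes 60x45
# around the same center.
padded = pad_box((10.0, 20.0, 40.0, 30.0), padding=1.5)
```

With a ratio above 1.0 the crop includes some background around the candidate, which is the context the reply says is needed to localize the object boundary.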

As for jittering, you're right: it is a means of data augmentation, and it also covers the range of step sizes between elements of the activations.
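As a rough illustration of scale jittering, the sketch below draws a random scale factor and applies it to the size of a padded region. The function names, the exponential form of the scale, and the default values are all assumptions chosen for the example; the repository's actual jitter formula may differ.

```python
import numpy as np


def sample_jitter_scale(rng, base=1.05, max_exponent=2.0):
    """Draw a scale factor in [base**-max_exponent, base**max_exponent].

    Hypothetical scheme: a clipped normal exponent, so factors cluster
    around 1.0 and are symmetric in log-scale.
    """
    exponent = np.clip(rng.standard_normal(), -max_exponent, max_exponent)
    return float(base ** exponent)


def jittered_size(width, height, scale):
    """Size of the region after rescaling it by `scale` (at least 1 pixel)."""
    return max(1, int(round(width * scale))), max(1, int(round(height * scale)))


rng = np.random.default_rng(0)
scales = [sample_jitter_scale(rng) for _ in range(1000)]
new_w, new_h = jittered_size(200, 100, 1.1)
```

Because the region is resized before the positive and negative boxes are cropped, the same boxes land at slightly different positions relative to the feature-map grid on each draw, which is one way to read the remark about covering the step size between activations.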

xi-mao commented 5 years ago

@sydney0zq I think the padded region is a single patch that includes all neg_examples, for example:

    padded_x1 = (neg_examples[:,0] - neg_examples[:,2]*(opts['padding']-1.)/2.).min()
    padded_y1 = (neg_examples[:,1] - neg_examples[:,3]*(opts['padding']-1.)/2.).min()
    padded_x2 = (neg_examples[:,0] + neg_examples[:,2]*(opts['padding']+1.)/2.).max()
    padded_y2 = (neg_examples[:,1] + neg_examples[:,3]*(opts['padding']+1.)/2.).max()
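A self-contained version of that computation can be sketched as follows. It assumes each row of `neg_examples` is `[x, y, w, h]` with a top-left `(x, y)`; the function name `padded_region` and the sample values are made up for illustration.

```python
import numpy as np


def padded_region(boxes, padding):
    """Bounding region (x1, y1, x2, y2) of all boxes after padding each one.

    boxes: (N, 4) array of [x, y, w, h], (x, y) assumed to be top-left.
    padding: ratio of padded size to original size.
    """
    boxes = np.asarray(boxes, dtype=np.float64)
    x, y, w, h = boxes.T
    # Left/top edge of each padded box: center minus half the padded size,
    # which simplifies to x - w * (padding - 1) / 2.
    x1 = (x - w * (padding - 1.0) / 2.0).min()
    y1 = (y - h * (padding - 1.0) / 2.0).min()
    # Right/bottom edge: center plus half the padded size,
    # which simplifies to x + w * (padding + 1) / 2.
    x2 = (x + w * (padding + 1.0) / 2.0).max()
    y2 = (y + h * (padding + 1.0) / 2.0).max()
    return float(x1), float(y1), float(x2), float(y2)


neg_examples = np.array([[10.0, 20.0, 40.0, 30.0],
                         [50.0, 10.0, 20.0, 20.0]])
region = padded_region(neg_examples, padding=1.5)
```

Taking the minimum of the left/top edges and the maximum of the right/bottom edges gives one crop that contains every padded box, so the image only needs to be cropped and resized once for all samples.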