Open sydney0zq opened 5 years ago
Hi, thanks for releasing the excellent work.
However, there are several points I cannot figure out.
The first concerns modules/pretrain_opts.py and options.py, which contain parameters such as padding_ratio and jitter. I also found that in https://github.com/IlchaeJung/RT-MDNet/blob/master/modules/data_prov.py we compute an extra padding area to get a larger image, then use jitter to scale this image, and finally crop the positive and negative regions. These operations are also applied during online tracking. Is this just a means of augmentation? Or, because MDNet's conv layers have no padding, do you add some uncertain padding to enlarge the original image size? I cannot figure it out, and the paper does not seem to explain it. Can you please explain why we do these padding and jitter operations?
Thank you very much!
Thank you for your interest. In MDNet, padding is applied to the patch of each candidate. It helps the tracker observe the object boundary precisely during online tracking. Without padding, the detected target size becomes smaller and smaller, whereas with an improper padding size it becomes bigger and bigger. Therefore, in RT-MDNet, we also use a padded image patch for each object proposal.
As for jittering, you're right. It is a means of data augmentation, and it also covers the range of step sizes between elements of the activations.
@sydney0zq I think the padded region is a patch that includes all neg_examples, for example:
padded_x1 = (neg_examples[:,0]-neg_examples[:,2]*(opts['padding']-1.)/2.).min()
padded_y1 = (neg_examples[:,1]-neg_examples[:,3]*(opts['padding']-1.)/2.).min()
padded_x2 = (neg_examples[:,0]+neg_examples[:,2]*(opts['padding']+1.)/2.).max()
padded_y2 = (neg_examples[:,1]+neg_examples[:,3]*(opts['padding']+1.)/2.).max()
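To make the arithmetic in those four lines easier to check, here is a minimal standalone sketch of the same computation. It assumes each box is stored as (x, y, w, h) with (x, y) the top-left corner, which is my reading of the snippet rather than something confirmed in this thread; the function name padded_region is hypothetical and not part of RT-MDNet.

```python
import numpy as np


def padded_region(boxes, padding):
    """Smallest axis-aligned region that contains every box after each box
    has been enlarged by `padding` around its own center.

    boxes:   (N, 4) array of (x, y, w, h); (x, y) is assumed to be the
             top-left corner (an assumption, not confirmed by the thread).
    padding: ratio >= 1; each box grows to padding * w by padding * h.
    """
    boxes = np.asarray(boxes, dtype=float)
    x, y, w, h = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]

    # Left/top edges move outward by half of the extra size ...
    x1 = (x - w * (padding - 1.0) / 2.0).min()
    y1 = (y - h * (padding - 1.0) / 2.0).min()
    # ... and right/bottom edges are x + w + w * (padding - 1) / 2,
    # which simplifies to x + w * (padding + 1) / 2.
    x2 = (x + w * (padding + 1.0) / 2.0).max()
    y2 = (y + h * (padding + 1.0) / 2.0).max()
    return x1, y1, x2, y2


# Two example boxes (made-up values) padded by a ratio of 1.5.
region = padded_region([[10, 20, 40, 40], [30, 10, 20, 60]], padding=1.5)
```

With these made-up boxes, region comes out as (0, -5, 60, 85). For a single box the region is simply that box scaled by the padding ratio about its center, so its width is padding * w. Note that the region can extend past the image border (negative coordinates here), so any real crop would still need to handle out-of-bounds pixels.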