Why is θr updated with momentum?

fangruizhu / self_sup_semiVOS


Why is θr updated with momentum? #3

Open houhouhouhou11 opened 4 years ago

houhouhouhou11 commented 4 years ago

Hello, thanks for your fantastic work. I have a question: why is θr updated with momentum?

Thank you very much!

houhouhouhou11 commented 4 years ago

Why is θr not updated by back-propagation? Thank you very much!

fangruizhu commented 4 years ago

@houhouhouhou11 Hi~ We update θr with momentum so that it doesn't need back-propagation. And in that way, we can maintain multiple frames at the same time without taking up GPU memory.
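For readers following along, here is a minimal sketch of what a momentum update without back-propagation can look like. It assumes the common exponential-moving-average form, where θr is pulled toward the parameters of a network that is trained by gradients. The name `theta_q`, the coefficient `m`, and the example values are hypothetical and not taken from this repository.

```python
def momentum_update(theta_r, theta_q, m=0.999):
    """Return EMA-updated reference parameters.

    Assumed rule: theta_r <- m * theta_r + (1 - m) * theta_q.
    No gradients are computed or stored for theta_r.
    """
    if len(theta_r) != len(theta_q):
        raise ValueError("parameter lists must have the same length")
    return [m * r + (1.0 - m) * q for r, q in zip(theta_r, theta_q)]


# Hypothetical parameters: theta_q would be trained by back-propagation,
# theta_r only ever follows it through the moving average.
theta_q = [1.0, -2.0, 0.5]
theta_r = [0.0, 0.0, 0.0]

for _ in range(3):
    theta_r = momentum_update(theta_r, theta_q, m=0.9)
```

Because θr is never part of the backward pass in this sketch, no activations have to be kept for it, which is consistent with the memory argument in the reply above.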

houhouhouhou11 commented 4 years ago

@fangruizhu Hi, thanks for your reply. In Table 1, you re-implement the pairwise model from MAST (59.6 vs 60.4). I would like to know why your result is so much better. Did you add anything else? Thank you very much!

fangruizhu commented 4 years ago

@houhouhouhou11 Well, actually it is a bit tricky. We use a larger input image size of 384x384 (256x256 in MAST) and add some data augmentation (basically random color distortion).

houhouhouhou11 commented 4 years ago

@fangruizhu Hi, thank you very much! In Table 1, you re-implement the pairwise model from MAST (59.6 vs 60.4). Did you use an input size of 384x384 or 256x256 to get the 59.6? Likewise, did you use data augmentation to get the 59.6? I ask because I have run MAST with only one reference frame, but the result I got is lower than 59.6. Thank you very much! (^_^)

fangruizhu commented 4 years ago

@houhouhouhou11 Hi~ I copied the result (59.6) directly from MAST (Table 5: only short memory). BTW, do you use Lab images as input and perform color channel dropout? This is quite important.
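As an illustration of the channel dropout mentioned above, here is a rough sketch under stated assumptions: the image is already converted to Lab, dropout means zeroing one randomly chosen channel, and it is applied with probability `p`. The function name, the value of `p`, and the zeroing behaviour are all assumptions for illustration and may differ from what MAST or this repository actually does.

```python
import random


def channel_dropout(image, p=0.5, rng=random):
    """Zero one randomly chosen channel with probability p (assumed behaviour).

    `image` is a list of channels (e.g. L, a, b), each a list of rows of floats.
    Returns a new image; the input is left unmodified.
    """
    # Copy every row so the caller's image is not changed.
    out = [[row[:] for row in channel] for channel in image]
    if rng.random() < p:
        dropped = rng.randrange(len(out))
        out[dropped] = [[0.0] * len(row) for row in out[dropped]]
    return out


# Tiny hypothetical 1x2 "Lab" image: three channels, one row, two pixels.
lab_image = [[[50.0, 60.0]], [[10.0, -5.0]], [[-20.0, 30.0]]]
augmented = channel_dropout(lab_image, p=0.5, rng=random.Random(0))
```

Passing a seeded `random.Random` instance keeps the augmentation reproducible when comparing runs.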

houhouhouhou11 commented 4 years ago

@fangruizhu Thanks for your reply. I use the same color space as the authors. How many reference frames do you use to get the result in Table 1 (J&F 60.4)? Thank you very much! (^_^)

fangruizhu commented 4 years ago

@houhouhouhou11 Hi, I use 1 reference frame in that case, where only pairwise matching is performed.