Closed jcliu0428 closed 4 years ago
Hi, thank you for your interest! This implementation is the same as in the original FGFA repo. The paper also says, "By default, we sample 2 frames in training and aggregate over 21 frames in inference." So during training the current frame is not always aggregated, but it can still be picked up by the random sampling of reference frames.
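For context, the train/inference sampling described above can be sketched as follows. The function names and the window sizes here are illustrative, not taken from the repo; the sketch assumes a symmetric temporal window around the current frame, and that the randomly drawn reference frame may coincide with the current frame itself (which is how the current frame can still enter aggregation during training).

```python
import random

def sample_training_pair(cur_idx, num_frames, window=9, seed=None):
    """Sketch of FGFA-style training sampling: besides the current
    frame, draw ONE reference frame uniformly from a temporal window
    around it (2 frames total per training iteration).
    `window` is an assumed hyperparameter, not the repo's value."""
    rng = random.Random(seed)
    lo = max(0, cur_idx - window)
    hi = min(num_frames - 1, cur_idx + window)
    # The reference index is drawn from the whole window, so it can
    # equal cur_idx -- random sampling can still pick the current frame.
    ref_idx = rng.randint(lo, hi)
    return cur_idx, ref_idx

def inference_window(cur_idx, num_frames, half_range=10):
    """At inference, aggregate over a fixed window of 21 consecutive
    frames (current frame +/- 10), clipped to the video bounds."""
    lo = max(0, cur_idx - half_range)
    hi = min(num_frames - 1, cur_idx + half_range)
    return list(range(lo, hi + 1))
```

For a frame well inside the video, `inference_window` yields exactly 21 indices; near the boundaries the window is clipped, as in the usual FGFA evaluation setup.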
Hi, thank you for your answer! I also noticed this in the original paper. By the way, I have another question. I notice that both your reimplementation and the official MXNet code multiply the FlowNet output by 2.5, but I have not seen this line in the original FlowNet code. Could you tell me why the output flow needs to be multiplied by 2.5?
I just followed the implementation of the original repo, so I don't know either :) Maybe you should ask the authors of the FGFA paper, or you could try removing the 2.5 factor and see whether the performance drops.
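Just to make the question above concrete, here is a minimal sketch of where the factor sits in a flow-guided warping step. The warp below is a simplified nearest-neighbor version in NumPy purely for illustration (the actual repos use bilinear sampling), and the 2.5 is applied as a plain rescaling of the predicted flow before warping, which is all the FGFA code does with it.

```python
import numpy as np

FLOW_SCALE = 2.5  # empirical factor inherited from the original FGFA repo

def warp_nearest(feat, flow):
    """Warp a nearby frame's feature map toward the current frame.

    feat: (H, W) feature map of a nearby frame
    flow: (2, H, W) raw flow-network output; flow[0] = dx, flow[1] = dy

    Nearest-neighbor sampling keeps the sketch short; the real code
    uses bilinear interpolation.
    """
    h, w = feat.shape
    scaled = FLOW_SCALE * flow  # <-- the factor in question
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.rint(xs + scaled[0]), 0, w - 1).astype(int)
    src_y = np.clip(np.rint(ys + scaled[1]), 0, h - 1).astype(int)
    return feat[src_y, src_x]
```

With zero flow the feature map comes back unchanged; a raw flow of 0.4 px becomes a 1 px shift after scaling (0.4 × 2.5 = 1.0), which is exactly the kind of magnitude change removing the factor would undo.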
Hi, thank you for your excellent codebase! I have a small question about your implementation of the feature aggregation from the FGFA paper. In the original paper, the feature of the current frame is also included when aggregating features across nearby frames. But in this line: https://github.com/Scalsol/mega.pytorch/blob/e9d7d4fa434c84bec98e3171e783dd0c720c3fb4/mega_core/modeling/detector/generalized_rcnn_fgfa.py#L131 it seems that only the weights of the nearby frames are computed. Is that correct? I have not read the whole framework, so could you take a look at it and let me know?
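For reference, the adaptive weighting this question is about can be sketched as below. This is a simplified NumPy version of the paper's cosine-similarity weights: the embedding sub-network and flow warping are omitted, the inputs are assumed to be per-pixel feature vectors, and an `include_current` flag mirrors the distinction the question draws (paper: current frame included; linked line: nearby frames only).

```python
import numpy as np

def aggregate(cur_feat, nearby_feats, include_current=True):
    """Sketch of FGFA-style adaptive feature aggregation.

    cur_feat:     (C,) embedded feature of the current frame (one pixel)
    nearby_feats: (N, C) embedded, flow-warped features of nearby frames

    Weights are cosine similarities to the current frame, normalized
    with a softmax; the output is the weighted sum of the features.
    """
    feats = nearby_feats
    if include_current:
        # The paper aggregates the current frame as well (its cosine
        # similarity with itself is 1).
        feats = np.vstack([cur_feat[None, :], nearby_feats])
    # cosine similarity between each frame's feature and the current one
    sims = feats @ cur_feat / (
        np.linalg.norm(feats, axis=1) * np.linalg.norm(cur_feat) + 1e-8)
    weights = np.exp(sims) / np.exp(sims).sum()  # softmax normalization
    return weights, weights @ feats
```

Either way the weights sum to 1; the flag only changes whether the current frame competes for weight mass alongside the warped nearby frames.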
Thanks