Hi, thanks for the nice paper. I have a question: in the paper you say that the number of frames per video is variable. How do you fuse the CNN outputs from the different frames so that the final output has a constant length? Do you just compute the average, or do you do something else? Thank you very much.