Closed jlim13 closed 4 years ago
Yes.
Thanks for the prompt response. I have not worked on action classification before, but does it really take only 5 frames (or some number considerably smaller than the entire video sequence) to classify a video?
You can tune that number as you wish, depending on your applications or tasks.
Different types of videos have different best values of num_segments. However, to have a fair comparison with other methods, people usually fix num_segments.
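To make the role of num_segments concrete, here is a minimal sketch of segment-based frame sampling: the video is split into num_segments equal chunks and the center frame of each chunk is used. The function name `sample_segment_indices` is hypothetical, and this is only an illustration of the general idea, not the sampling code of this repository, which may pick frames differently.

```python
from typing import List


def sample_segment_indices(num_frames: int, num_segments: int) -> List[int]:
    """Pick one frame index per segment (the segment center).

    Hypothetical helper for illustration; not taken from the TA3N code.
    """
    if num_frames <= 0 or num_segments <= 0:
        raise ValueError("num_frames and num_segments must be positive")
    # Length of each segment in frames (may be fractional).
    seg_len = num_frames / num_segments
    # Center of segment i, clamped to the last valid frame index.
    return [
        min(num_frames - 1, int(seg_len * i + seg_len / 2))
        for i in range(num_segments)
    ]


if __name__ == "__main__":
    # A 100-frame video represented by 5 frames.
    print(sample_segment_indices(100, 5))
```

With this kind of sampling, a larger num_segments covers the video more densely, and a smaller one trades temporal detail for lower cost.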
I see. When I set num_segments to something really high like 100-300, my script just crashes. Can you give some insight as to what is going on? Maybe TA3N isn't suitable for my problem if I need the entire input sequence to properly classify my input.
Thanks!
I guess the reason is the computational cost. TA3N computes relations between frames, and the number of relations depends on num_segments. If num_segments is really high (e.g. the entire input sequence), I think you need lots of GPU resources. Otherwise, you may also develop some tricks to reduce the computation, and then you can still apply the main concept of TA3N to your case.
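As a rough illustration of why the cost grows so quickly, the sketch below counts how many frame subsets exist if relations are formed over every group of 2 or more frames. That "all subsets at every scale" rule is an assumption made for illustration, and `count_frame_relations` is a hypothetical helper; the actual implementation may subsample or cap the relations per scale, in which case the growth is slower but memory use still increases with num_segments.

```python
from math import comb


def count_frame_relations(num_segments: int) -> int:
    """Count all frame subsets of size 2..num_segments.

    Assumes every subset at every scale is used (an upper bound for
    illustration only; not taken from the TA3N code).
    """
    return sum(comb(num_segments, k) for k in range(2, num_segments + 1))


if __name__ == "__main__":
    for n in (5, 10, 20, 100):
        print(n, count_frame_relations(n))
```

Under this assumption the count is 2^n - n - 1, so it is tiny for 5 segments and unmanageable for 100, which is consistent with a crash from running out of memory.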
hi,
for any given iteration, does the network use num_segments number of frames to classify a video?