Hi, scholar. I have a question about testing on untrimmed videos. In your paper, a video is represented as a feature set z = {z1, z2, z3, z4, ..., zN}. As I understand it, you do not use all N time steps; instead, you extract T time steps for the final classification of one video sample. My guess is that at every step you search the same set z = {z1, z2, z3, z4, ..., zN} for the most useful feature, call it zj with j in [1, N], and that you repeat this selection T times over the same set. The T selected features are then aggregated to classify the action. Is that correct?
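To make my guess concrete, here is a rough Python sketch of what I imagine. It is only my reading, not your implementation: the names `select_and_aggregate`, `score`, `toy_score`, and `classify` are hypothetical, and the running-mean aggregation and the linear classifier are my own assumptions rather than anything stated in the paper.

```python
from typing import Callable, List, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def select_and_aggregate(
    z: List[Vector],
    T: int,
    score: Callable[[Vector, Vector], float],
) -> Tuple[List[int], Vector]:
    """Pick one feature per step from the SAME set z, T times in total.

    `score(state, feature)` is a hypothetical stand-in for whatever the
    model uses to decide which time step is most useful next.
    Returns the chosen indices (0-based) and the aggregated feature.
    """
    state = [0.0] * len(z[0])  # aggregate of what has been selected so far
    selected: List[int] = []
    for t in range(T):
        # Search all N candidates every step; an index may be picked again.
        j = max(range(len(z)), key=lambda i: score(state, z[i]))
        selected.append(j)
        # Assumed aggregation: running mean of the selected features.
        state = [(s * t + x) / (t + 1) for s, x in zip(state, z[j])]
    return selected, state


def classify(feature: Vector, weights: List[Vector]) -> int:
    """Assumed linear classifier: index of the class with the highest score."""
    return max(range(len(weights)), key=lambda c: dot(weights[c], feature))


def toy_score(state: Vector, feature: Vector) -> float:
    """Toy scoring rule: prefer large features unlike the current aggregate."""
    return sum(feature) - dot(state, feature)


# Toy example: N = 3 time steps, T = 2 selections, 2 classes.
z = [[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
indices, video_feature = select_and_aggregate(z, T=2, score=toy_score)
label = classify(video_feature, weights=[[1.0, 0.0], [0.0, 1.0]])
```

In this sketch every step looks at all N features again, so only T of them (possibly with repeats) reach the classifier. If your method instead removes a feature once it has been chosen, or aggregates differently, that is exactly the part I would like to understand.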