Thank you for your awesome work; I enjoyed reading it. When I tried to run your code, I came up with two questions:
How long does training on the AG dataset take under the PredCLS setting? I ran _test_netrel.py on a single 3090 GPU and found that the _runinference process takes about 6 hours, and that is just testing. May I ask how long the whole training process took?
For each center frame in a video, the model computes temporal features with I3D. However, the sampled frame segments can be highly similar for different center frames from the same video. I wonder whether some of the I3D computation is redundant?
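To make the second question concrete, here is a minimal sketch of how the overlap between the segments of two center frames could be measured. The window length, stride, clamping behavior, and the names `sample_segment` and `overlap_ratio` are my own assumptions for illustration and are not taken from the repository.

```python
from typing import Tuple


def sample_segment(center: int, num_frames: int,
                   seg_len: int = 8, stride: int = 1) -> Tuple[int, ...]:
    """Frame indices of a seg_len-frame window around `center`.

    Hypothetical sampling scheme: indices falling outside the video
    are clamped to the first or last frame.
    """
    half = seg_len // 2
    return tuple(
        min(max(center + (i - half) * stride, 0), num_frames - 1)
        for i in range(seg_len)
    )


def overlap_ratio(seg_a: Tuple[int, ...], seg_b: Tuple[int, ...]) -> float:
    """Fraction of distinct frames in seg_a that also appear in seg_b."""
    frames_a, frames_b = set(seg_a), set(seg_b)
    return len(frames_a & frames_b) / len(frames_a)


if __name__ == "__main__":
    # Two center frames that are two frames apart in a 100-frame video.
    seg_a = sample_segment(10, 100)
    seg_b = sample_segment(12, 100)
    print(seg_a, seg_b, overlap_ratio(seg_a, seg_b))
```

Under these assumptions, two center frames that are two frames apart share 6 of their 8 sampled frames, which is the kind of redundancy I am asking about.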
Training took me less than a day on a 2080 Ti. I don't have a 3090, so I can't give the exact time cost on it, but I think 6 hours for testing is normal.
We use I3D features for a fair comparison on VidVRD, as noted in Table 8 of the paper; the baselines showed that I3D features bring some improvement on this dataset.
I3D can capture some subtle motion information, though it seems to be dominated by appearance features.