MCG-NJU / TRACE

[ICCV 2021] Target Adaptive Context Aggregation for Video Scene Graph Generation

Questions about training time and redundant calculation #12

Closed qncsn2016 closed 2 years ago

qncsn2016 commented 2 years ago

Thank you for your awesome work; I enjoyed reading it. When I tried to run your code, two questions came up:

  1. How long does it take to train on the AG dataset under the PredCLS setting? I ran test_net_rel.py on one 3090 GPU and found that the run_inference step alone took about 6 hours, and that is just testing. May I ask how long the whole training process took?
  2. For each center frame in the same video, the model computes temporal features with I3D. However, the sampled frame segments could be highly similar for different center frames from the same video. I wonder whether some of the I3D computation is redundant (a rough caching sketch of what I mean follows below).
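To illustrate the redundancy concern in question 2, here is a minimal caching sketch (not taken from the TRACE code; `i3d_backbone` and `sample_segment` are hypothetical stand-ins): clips that several center frames happen to share are run through I3D once per video and then reused.

```python
import torch

# Cache keyed by (video_id, clip frame indices) so the same clip is only run
# through I3D once, even if several center frames map to the same segment.
_i3d_cache = {}

def get_i3d_feature(video_id, frames, center_idx, i3d_backbone, sample_segment):
    """Return the I3D feature for the clip around `center_idx`, reusing the
    cached result when different center frames share the same clip.

    `frames` is a (T, C, H, W) tensor of decoded video frames;
    `sample_segment(num_frames, center_idx)` is a hypothetical helper that
    returns the frame indices of the clip around a center frame.
    """
    segment = tuple(sample_segment(len(frames), center_idx))  # e.g. 32 indices
    key = (video_id, segment)
    if key not in _i3d_cache:
        clip = frames[list(segment)]                   # (T, C, H, W)
        clip = clip.permute(1, 0, 2, 3).unsqueeze(0)   # (1, C, T, H, W) for I3D
        with torch.no_grad():
            _i3d_cache[key] = i3d_backbone(clip)
    return _i3d_cache[key]
```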
tyshiwo1 commented 2 years ago

Thank you for your interest in our work.

  1. Training took me less than a day on a 2080 Ti. I don't have a 3090, so I can't give the exact time on it, but spending about 6 hours on testing sounds normal.
  2. We use the I3D feature for a fair comparison on VidVRD, as mentioned in Table 8 of the paper, and the baselines showed that the I3D feature can bring some improvement on this dataset. I3D can capture some subtle motion information, though it seems to be dominated by appearance features.
qncsn2016 commented 2 years ago

Thanks for your reply!