kazucmpt closed this issue 5 years ago
Yes, when training AT, we use the ground-truth gaze position. When testing, we use the predicted gaze position.
Thank you.
How did you decide the predicted gaze point of the first frame g_0 during testing?
We don't need g_0 when testing. g_0 is the same as the SP output.
I am confused. During testing, we need g_0 in order to compute g_1. But computing g_0 requires F_{-1}^s and F_{-1}^t, which are undefined. Do you mean that you define g_0 as the output of the SP module, because G_t^s can be computed without F_{-1}^s and F_{-1}^t?
Yes, that is actually what we expect.
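The exchange above can be sketched in a few lines. This is only an illustration of the logic as described in this thread, not the repository's code: `rollout_gaze`, `sp_predict`, and `at_predict` are hypothetical names, standing in for the SP module and for whatever step combines the current frame with the previous gaze point g_{t-1}.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Gaze = Tuple[float, float]  # (x, y) gaze coordinates


def rollout_gaze(
    sp_predict: Callable[[object], Gaze],        # hypothetical stand-in for the SP module
    at_predict: Callable[[object, Gaze], Gaze],  # hypothetical stand-in for the step that uses g_{t-1}
    frames: Sequence[object],
    gt_gaze: Optional[Sequence[Gaze]] = None,    # ground truth (training); None when testing
) -> List[Gaze]:
    """Predict one gaze point per frame.

    Frame 0 has no previous frame, so g_0 is taken from the SP output alone.
    For t >= 1, the previous gaze point is the ground truth during training
    and the model's own previous prediction during testing.
    """
    preds: List[Gaze] = []
    for t, frame in enumerate(frames):
        if t == 0:
            pred = sp_predict(frame)  # g_0: SP output, no F_{-1} needed
        else:
            prev = gt_gaze[t - 1] if gt_gaze is not None else preds[t - 1]
            pred = at_predict(frame, prev)
        preds.append(pred)
    return preds


# Toy stand-ins, only to make the control flow visible.
toy_sp = lambda frame: (float(frame), float(frame))
toy_at = lambda frame, prev: (prev[0] + 1.0, prev[1] + 1.0)

test_preds = rollout_gaze(toy_sp, toy_at, frames=[0, 1, 2])
train_preds = rollout_gaze(
    toy_sp, toy_at, frames=[0, 1, 2],
    gt_gaze=[(10.0, 10.0), (20.0, 20.0), (30.0, 30.0)],
)
```

With the toy functions, the test-time rollout chains each prediction into the next, while the training rollout restarts from the ground-truth point at every step.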
I understand now. Thank you.
I have a question about the Channel weight extractor in the Attention Transition Module. The Channel weight extractor needs the coordinates of the predicted gaze point in the previous frame, g_{t-1}.
In Section 3.4 of your paper, you wrote that g_{t-1} is the PREDICTED gaze point. But in AT.py, you use the function "computeAAEAUC" to obtain g_{t-1}, and it seems that you compute g_{t-1} only from sample['gt'], which is the GROUND TRUTH.
So my understanding is as follows: during training, you use the GROUND TRUTH gaze point of the previous frame. During testing, you use the PREDICTED gaze point of the previous frame.
Is it correct?