aghanbar closed 5 years ago
Sorry for the slow response. Each output frame is a t+1 prediction given all the previous frames: for example, the 5th frame in the prediction output is the frame predicted after seeing the first 4 input frames. The network makes an output prediction at every time step, using the frames before it as input, which is why 10 input frames produce 10 output frames.
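To make the indexing concrete, here is a minimal NumPy sketch of that behaviour. It is not the repository's code: `toy_predict_next` and `rollout` are hypothetical names, and the "model" is just a stand-in that copies the last frame it has seen (and returns zeros before it has seen anything). The only point is to show how the output frames line up with the input frames.

```python
import numpy as np


def toy_predict_next(seen_frames, frame_shape):
    """Hypothetical stand-in for the network.

    Guesses the next frame as a copy of the last frame seen so far,
    or all zeros if no frame has been seen yet.
    """
    if len(seen_frames) == 0:
        return np.zeros(frame_shape)
    return seen_frames[-1].copy()


def rollout(frames):
    """Produce one prediction per time step.

    outputs[t] is computed from frames[:t] only, so it is the
    prediction for frames[t] made before frames[t] is seen.
    """
    outputs = np.empty_like(frames)
    for t in range(len(frames)):
        outputs[t] = toy_predict_next(frames[:t], frames.shape[1:])
    return outputs


# Toy sequence of 10 frames of size 2x2; frame t is filled with the value t.
frames = np.arange(10, dtype=float).reshape(10, 1, 1) * np.ones((10, 2, 2))
outputs = rollout(frames)

print(outputs.shape)  # same shape as the input: (10, 2, 2)

# The 5th output (index 4) was produced after seeing only the first 4 frames.
print(np.array_equal(outputs[4], frames[3]))

# Score predictions against the frames they were meant to predict, skipping
# the first step, where the stand-in model has seen nothing yet.
mse = np.mean((frames[1:] - outputs[1:]) ** 2)
print(mse)
```

In this sketch the 10 outputs are aligned with the 10 inputs, so comparing `outputs[t]` with `frames[t]` measures how well each frame was predicted from the frames before it. Skipping index 0 when scoring is an assumption of this example, made because the first output is produced before any frame has been seen.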
I have read the paper, the README, and the code, but I am still not quite clear on what the output of the network represents. In evaluation mode, the network receives 10 frames for each sample and outputs 10 frames. Shouldn't it output just one frame, since the aim of the network is to predict the next frame (t+1), in this case frame 11? Thanks!