seominseok0429 / Implicit-Stacked-Autoregressive-Model-for-Video-Prediction

Implicit Stacked Autoregressive Model for Video Prediction (official implementation)

Why does the input to the model contain ground-truth??? #1

Closed LintureGrant closed 1 year ago

LintureGrant commented 1 year ago

The following code is in your valid function:

    pred_y = self.ema_model(batch_x, batch_y, t)

I have a question: can a predictive model take the ground truth (GT) as input? If the GT is an input to the model during validation, why not output the GT directly? This means the results in your paper are generated from both batch_x and batch_y, so the performance is very good, even nearly 250% better than SimVP on human datasets. If you output the GT directly, the MSE becomes 0! This is ridiculous.

The test function does not take the GT as input, but it simply won't run after training and validation, because the ground truth is missing:

    pred_y = self.model(batch_x.to(self.device))

Your model does not work without the GT. This is a fatal problem: the model is completely unable to predict future frames. In other words, if I already know the actual values to be predicted in advance, why do I still need a model to predict them?
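One way to probe the concern raised above is to check whether a prediction routine's output changes when the target passed to it is replaced by noise. The following is only a minimal, framework-free sketch: `depends_on_target`, `leaky_predict`, and `honest_predict` are hypothetical names invented for illustration, frames are reduced to flat lists of floats, and nothing here is taken from the repository's code.

```python
import random


def mse(pred, target):
    """Mean squared error between two equally sized flat lists."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def depends_on_target(predict, batch_x, batch_y, trials=5, seed=0):
    """Return True if predict(batch_x, y) changes when y is swapped for noise."""
    rng = random.Random(seed)
    reference = predict(batch_x, batch_y)
    for _ in range(trials):
        noise = [rng.uniform(-1.0, 1.0) for _ in batch_y]
        if predict(batch_x, noise) != reference:
            return True
    return False


# Hypothetical stand-in: copies the target, so its MSE is trivially zero.
def leaky_predict(batch_x, batch_y):
    return list(batch_y)


# Hypothetical stand-in: repeats the last observed value and uses the
# target only to decide how many values to produce.
def honest_predict(batch_x, batch_y):
    return [batch_x[-1]] * len(batch_y)


batch_x = [0.1, 0.2, 0.3]
batch_y = [0.4, 0.5]
leaky_error = mse(leaky_predict(batch_x, batch_y), batch_y)
leaky_flag = depends_on_target(leaky_predict, batch_x, batch_y)
honest_flag = depends_on_target(honest_predict, batch_x, batch_y)
```

A routine that is flagged here is not necessarily invalid (teacher forcing during training also consumes the target), but reported evaluation numbers should come from a path for which the check returns False.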

seominseok0429 commented 1 year ago

Thank you for your interest in our work.

I haven't posted our inference code yet!

It will be uploaded by this week.

If you need it sooner, you can write the inference code yourself in a stacked autoregressive manner.
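For readers wondering what inference "in a stacked autoregressive manner" might look like, here is a minimal sketch and not the repository's released test code. It assumes a hypothetical callable `model(batch_x, previous, t)`, loosely modelled on the three-argument call quoted in the question, where `previous` holds the model's own earlier predictions in place of the ground truth. `toy_model` is an invented stand-in, and frames are plain lists of floats so the example runs without any framework.

```python
def autoregressive_rollout(model, batch_x, num_future):
    """Predict num_future frames, feeding earlier predictions back in.

    No ground-truth future frame is ever passed to the model.
    """
    predictions = []
    for t in range(num_future):
        # The model sees the observed frames and its own earlier outputs only.
        next_frame = model(batch_x, list(predictions), t)
        predictions.append(next_frame)
    return predictions


def toy_model(batch_x, previous, t):
    """Hypothetical stand-in: linear extrapolation from the two latest frames."""
    history = list(batch_x) + list(previous)
    last, second_last = history[-1], history[-2]
    return [2 * a - b for a, b in zip(last, second_last)]


# Two observed frames, each with two "pixels".
observed = [[0.0, 1.0], [1.0, 2.0]]
future = autoregressive_rollout(toy_model, observed, num_future=3)
```

With a real network, the same loop would typically run under evaluation mode with gradients disabled, and the predicted frames would be compared against the ground truth only after the rollout finishes.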

Thank you!

seominseok0429 commented 1 year ago

@LintureGrant I have now released the test code!