In the LSTM autoencoder you have, the training and testing inputs to the decoder are different. During training the decoder predicts p(x[t+1] | x[<=t]), but at test time it is instead conditioned on its own previous outputs, i.e. p(x[t+1] | y[<=t]).
This seems a bit off to me. Is it expected to be like this? Is there a reference somewhere for doing it this way?
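For concreteness, here is a toy sketch of the two decoding modes I mean. It is not your code: a single linear step stands in for the LSTM cell, and all names are made up, but it shows how the conditioning differs between training and testing.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy "decoder": one linear map standing in for an LSTM cell.
W = rng.normal(size=(4, 4)) * 0.1

def step(inp):
    # Predict the next timestep from the current input.
    return W @ inp

x = rng.normal(size=(5, 4))  # ground-truth sequence x[0..4]

# Training-style decoding (teacher forcing): condition on ground truth x[<=t].
train_preds = [step(x[t]) for t in range(4)]

# Test-style decoding (free running): condition on own outputs y[<=t].
y_t = x[0]
test_preds = []
for _ in range(4):
    y_t = step(y_t)
    test_preds.append(y_t)

# The two decodings agree on the first step but diverge afterwards,
# because the free-running loop feeds back its own predictions.
print(np.allclose(train_preds[0], test_preds[0]))
print(np.allclose(train_preds[1], test_preds[1]))
```

The first printed value is True (both modes start from x[0]); the second is False, since from the second step on the free-running decoder conditions on y rather than x.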