Hi! Thank you for your thorough investigation and for pointing this out!
Apologies for taking so long to get around to sorting this out. Only minor changes to `exp_main.py` and `LSTM.py` were required to resolve the issue. When transferring the code to this public repository, I simply forgot to check the teacher forcing properly, as it was not used for the final graph-based models, which do not have decoders. I therefore missed some key functionality from my original code: feeding the true values `batch_y` to the LSTM instead of the placeholders. For validation, on lines 96 and 97 of `exp_main.py`, we do not feed the true values, as we do not want teacher forcing during validation and testing.
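For a concrete picture of the distinction described above, here is a minimal, hypothetical sketch of teacher forcing during training versus a placeholder during validation. The function name, tensor shapes, and the zero placeholder are illustrative assumptions, not the repository's actual lines 96 and 97.

```python
import torch

# Hypothetical illustration only, not the repository's exact code.
# batch_y has shape [batch, label_len + pred_len, features]: the known
# label_len history followed by the pred_len true future values.
def build_decoder_input(batch_y, label_len, pred_len, teacher_forcing):
    if teacher_forcing:
        # Training: the decoder may see the true future values (batch_y).
        return batch_y
    # Validation/testing: keep the known history but replace the future
    # slots with a placeholder so no ground truth leaks into the decoder.
    placeholder = torch.zeros_like(batch_y[:, -pred_len:, :])
    return torch.cat([batch_y[:, :label_len, :], placeholder], dim=1)
```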
I will close this issue as I believe it is resolved now, but please re-open it if you find any other bugs.
Hello, it's me again! Thanks for your patient answer. I have two questions for you (a sketch illustrating both follows below):

1. In the `exp_main.py` file, on lines 96 and 97 of the code, `dec_zeros` takes the value of the last past value repeated `pred_len` times, while the original Autoformer codebase uses 0 as the mask. Why should the repeated last value be used as the mask?
2. In `LSTM.py`, according to the data division in `exp_main.py`, the input `x_dec` is composed of `label_len` past values followed by the last past value repeated `pred_len` times; that is, `x_dec` does not contain future values. In the "mixed_teacher_forcing" case, the future values are needed as the "teacher" for supervised learning, but the corresponding teacher-forcing code is `dec_inp = target[:, t, :]`, which means the repeated last past value is used as the supervised "teacher" instead of the true future value. Is this the right approach?
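To make the two questions concrete, here is a small, hypothetical sketch contrasting the two placeholder schemes being discussed: zeros (as in the original Autoformer codebase) versus the last observed value repeated `pred_len` times. The function and variable names are illustrative assumptions, not the repository's actual lines 96 and 97. If the resulting tensor is what later gets indexed as `target[:, t, :]` during mixed teacher forcing, then the per-step "teacher" is indeed the repeated past value rather than the true future value, which is the crux of the second question.

```python
import torch

# Hypothetical illustration of the two placeholder schemes, not the
# repository's actual code.
# batch_x: [batch, seq_len, features]               -- encoder input (past values)
# batch_y: [batch, label_len + pred_len, features]  -- label history + true future
def decoder_placeholder(batch_x, batch_y, label_len, pred_len, scheme="repeat_last"):
    if scheme == "zeros":
        # Original Autoformer codebase: mask the future slots with zeros.
        future = torch.zeros_like(batch_y[:, -pred_len:, :])
    else:
        # Scheme described in question 1: repeat the last observed value
        # pred_len times to fill the future slots.
        future = batch_x[:, -1:, :].repeat(1, pred_len, 1)
    # label_len known past values followed by the pred_len placeholders;
    # this is the x_dec composition described in question 2.
    return torch.cat([batch_y[:, :label_len, :], future], dim=1)
```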