Adds teacher-forced and standard inference tests in which the output itself is evaluated quantitatively. The ground-truth outputs and the test model are stored in Google Drive. Because of what appears to be numerical instability related to the test environment, inference quality is assessed by correlation with the ground-truth outputs. Teacher-forced inference does appear to be working. These tests should also be useful for merging the batch class PR, once a quantitative training test has been added as well.
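For reviewers, here is a minimal sketch of what a correlation-based check of this kind could look like. It is not the code in this PR: the function names, the use of Pearson correlation, and the `min_correlation` threshold of 0.99 are all assumptions chosen for illustration.

```python
import numpy as np


def pearson_correlation(output, reference):
    """Pearson correlation between two arrays, compared element-wise after flattening."""
    output = np.asarray(output, dtype=np.float64).ravel()
    reference = np.asarray(reference, dtype=np.float64).ravel()
    if output.shape != reference.shape:
        raise ValueError("output and reference must contain the same number of elements")
    # np.corrcoef returns a 2x2 matrix; the off-diagonal entry is the correlation.
    return float(np.corrcoef(output, reference)[0, 1])


def assert_inference_matches(output, reference, min_correlation=0.99):
    """Hypothetical helper: fail if the output is not strongly correlated with the reference.

    The 0.99 default is an assumed threshold, not a value taken from this PR.
    """
    corr = pearson_correlation(output, reference)
    assert corr >= min_correlation, (
        f"correlation {corr:.4f} is below the threshold {min_correlation}"
    )
    return corr
```

A correlation check tolerates small element-wise deviations that would fail an exact or tight-tolerance comparison, but it is also insensitive to a constant offset or a positive rescaling of the output, so it is a weaker guarantee than an element-wise match.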