Closed JHLew closed 1 year ago
You should check the argument n_unmasked. For example, here:
https://github.com/SongweiGe/TATS/blob/8ea1b587a74736d420b70cc2b52ac1683682ec6c/tats/modules/gpt.py#L96
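To illustrate the idea behind an n_unmasked-style argument, here is a minimal sketch (not the TATS code itself; the exact implementation at the link above may differ). The usual causal mask is lower-triangular; n_unmasked carves out a leading block of positions, e.g. conditioning tokens, that can attend to each other bidirectionally:

```python
import torch

def build_attention_mask(block_size: int, n_unmasked: int = 0) -> torch.Tensor:
    """Lower-triangular causal mask of shape (block_size, block_size).

    The first `n_unmasked` positions (e.g. conditioning tokens) are allowed
    to attend to each other fully, while the rest remain strictly causal.
    """
    # Standard autoregressive (causal) mask: position i sees positions <= i.
    mask = torch.tril(torch.ones(block_size, block_size))
    if n_unmasked > 0:
        # Relax causality within the leading conditioning block only.
        mask[:n_unmasked, :n_unmasked] = 1
    return mask
```

With n_unmasked = 0 this reduces to a plain causal mask, so the same attention module can serve both the purely autoregressive setting and a setting with bidirectional context, just by changing this one argument.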
Thank you so much for the kind feedback.
I wish to check if my understanding is correct:
I was expecting the autoregressive transformer and the interpolation transformer to be two separate modules. Are the two instead not explicitly separated, but implemented in a single GPT model, simply with different masking methods in the self-attention layer?
That's correct!!
Hi, thanks for the awesome work.
I have been going through the code to better understand the paper. I wish to understand how the interpolation transformer is implemented, but could not find it.
I assume the transformer (GPT) of Net2NetTransformer in tats/tats_transformer.py is the autoregressive transformer, but there does not seem to be another transformer for interpolation?
I found the Causal Attention to exist, but not the Interpolation Causal Attention.
Is it not provided or am I missing something? Could you kindly specify where to look?