openai / finetune-transformer-lm

Code and model for the paper "Improving Language Understanding by Generative Pre-Training"
https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf
MIT License

Question about the shape of `X_train` #9

Status: Open · FrankWork opened 6 years ago

FrankWork commented 6 years ago

```python
X_train = tf.placeholder(tf.int32, [n_batch_train, 2, n_ctx, 2])
xmb[:, :, :, 1] = np.arange(n_vocab + n_special, n_vocab + n_special + n_ctx)
```

Why is there a channel of additional tokens (the last dimension of size 2)?

FrankWork commented 6 years ago

Problem solved! That part of `xmb` is used for the learned positional encoding. https://github.com/huggingface/pytorch-openai-transformer-lm/issues/12#issuecomment-401770634
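For anyone landing here later, here is a minimal NumPy sketch of the idea, not the repository's exact code: the second channel stores position ids offset past the vocabulary and special tokens, so token ids and position ids index the same embedding table and their embeddings are summed. The dimension sizes and random values below are illustrative assumptions.

```python
import numpy as np

# Illustrative sizes (assumed, not taken from the repo's configs).
n_vocab, n_special, n_ctx = 40000, 3, 512
n_embd = 768

# xmb[..., 0] holds token ids; xmb[..., 1] holds position ids that are
# offset past the vocabulary + special tokens so both index one table.
xmb = np.zeros((1, n_ctx, 2), dtype=np.int32)
xmb[:, :, 0] = np.random.randint(0, n_vocab, size=(1, n_ctx))                 # token ids
xmb[:, :, 1] = np.arange(n_vocab + n_special, n_vocab + n_special + n_ctx)    # position ids

# One shared embedding matrix covers tokens, special tokens, and positions.
we = np.random.randn(n_vocab + n_special + n_ctx, n_embd).astype(np.float32)

# Look up both channels and sum them: h = token_embedding + position_embedding.
h = we[xmb].sum(axis=2)   # shape: (1, n_ctx, n_embd)
print(h.shape)
```

This mirrors the gather-then-sum pattern over the last axis that the question's `X_train` placeholder shape (`[..., n_ctx, 2]`) implies, with the extra leading dimension of 2 in the original shape coming from the two delimiter orderings used for the classification tasks.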