openai / finetune-transformer-lm

Code and model for the paper "Improving Language Understanding by Generative Pre-Training"
https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf
MIT License

What is the meaning of M in the inputs #12

ioana-blue closed this issue 6 years ago

ioana-blue commented 6 years ago

From what I can tell, in the ROCStories code, M is initialized to 1 here: https://github.com/openai/finetune-transformer-lm/blob/master/train.py#L243

And the only place where it's used in the model (after a reshape) is here: https://github.com/openai/finetune-transformer-lm/blob/master/train.py#L179

What is M supposed to represent/encode?

madisonmay commented 6 years ago

M is short for mask -- it's used to mask the language modeling loss so that only the tokens actually present in each input sequence contribute; padded positions are zeroed out. See https://github.com/openai/finetune-transformer-lm/blob/master/train.py#L179 for where it's applied.
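For illustration, here is a minimal NumPy sketch of that masking pattern. The toy batch, lengths, and loss values are hypothetical, but the `lm_losses * M[:, 1:]` step mirrors the line linked above, where the mask is shifted by one to align with next-token targets:

```python
import numpy as np

# Hypothetical toy setup: a batch of 2 sequences padded to length 6.
# The real sequence lengths are 4 and 6; padded positions get no loss.
n_ctx = 6
lengths = [4, 6]

# M is 1 at real token positions and 0 at padding (same convention
# as the mmb array built in transform_roc in train.py).
M = np.zeros((len(lengths), n_ctx), dtype=np.float32)
for i, n in enumerate(lengths):
    M[i, :n] = 1.0

# Per-token language modeling losses (random stand-ins here); there are
# n_ctx - 1 of them because the targets are the inputs shifted by one.
lm_losses = np.random.rand(len(lengths), n_ctx - 1).astype(np.float32)

# Zero out losses at padded positions, then average over real tokens only.
masked = lm_losses * M[:, 1:]
per_seq_loss = masked.sum(axis=1) / M[:, 1:].sum(axis=1)
print(per_seq_loss)  # one masked mean LM loss per sequence
```

Without the division by `M[:, 1:].sum(axis=1)`, shorter sequences would appear to have artificially low loss simply because they contain fewer real tokens.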

ioana-blue commented 6 years ago

Excellent, thank you! It would be great if ML programmers were more generous with comments :) or used slightly longer variable names that help with code readability. Thank you for the clarification, I appreciate it!