openai / gpt-2

Code for the paper "Language Models are Unsupervised Multitask Learners"
https://openai.com/blog/better-language-models/

Batch of samples with different length #160

Closed gionapaolini closed 5 years ago

gionapaolini commented 5 years ago

Hi,

I'd like to feed short samples of various lengths to the model, and I would like to put them in a single batch.
Hence my question: does the model support padding?

If yes, what is the padding token?

WuTheFWasThat commented 5 years ago

No, you'd have to modify the code to accept a binary mask and use it to mask out the tokens you want to pad (by changing how the positional embeddings and the autoregressive mask are computed).
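The modification described above is not in this repo, but the idea can be sketched in NumPy. This is a hypothetical illustration, not GPT-2's actual code: `pad_batch` left-pads variable-length sequences, builds the binary mask, and computes position ids over the real tokens only (so padding doesn't shift the positional embeddings); `combined_attention_mask` intersects that padding mask with the usual causal mask. The `pad_id` value is arbitrary since padded positions are masked out.

```python
import numpy as np

PAD_ID = 0  # hypothetical pad token id; any id works because it is masked out


def pad_batch(sequences, pad_id=PAD_ID):
    """Left-pad variable-length token sequences into one batch.

    Returns (tokens, mask, positions) where mask is 1 for real tokens and
    positions count 0..n-1 over the real tokens of each row.
    """
    max_len = max(len(s) for s in sequences)
    n_rows = len(sequences)
    tokens = np.full((n_rows, max_len), pad_id, dtype=np.int64)
    mask = np.zeros((n_rows, max_len), dtype=np.int64)
    positions = np.zeros((n_rows, max_len), dtype=np.int64)
    for i, seq in enumerate(sequences):
        n = len(seq)
        tokens[i, max_len - n:] = seq
        mask[i, max_len - n:] = 1
        positions[i, max_len - n:] = np.arange(n)
    return tokens, mask, positions


def combined_attention_mask(mask):
    """Combine the padding mask with the autoregressive (causal) mask.

    Entry [b, i, j] is 1 iff position i may attend to position j:
    j must not be in the future (j <= i) and token j must be real.
    """
    seq_len = mask.shape[1]
    causal = np.tril(np.ones((seq_len, seq_len), dtype=np.int64))
    return mask[:, None, :] * causal[None, :, :]
```

For example, batching `[5, 6, 7]` with `[8, 9]` yields tokens `[[5, 6, 7], [0, 8, 9]]`, mask `[[1, 1, 1], [0, 1, 1]]`, and positions `[[0, 1, 2], [0, 0, 1]]`; the combined attention mask for the second row lets the final token attend to positions 1 and 2 but never to the pad slot.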