openai / gpt-2

Code for the paper "Language Models are Unsupervised Multitask Learners"
https://openai.com/blog/better-language-models/
Other
22.58k stars 5.53k forks source link

conv1d naming? #165

Closed ZeweiChu closed 5 years ago

ZeweiChu commented 5 years ago

I am wondering why is this function https://github.com/openai/gpt-2/blob/master/src/model.py#L50 named conv1d?

It seems to be a linear transformation to me, not a conv1d operation.

WuTheFWasThat commented 5 years ago

yeah, it could just be called linear. since we're operating on sequences but acting independently on the sequence dimension, it can also be thought of as a conv with filter size 1, hence the name