codertimo / BERT-pytorch

Google AI 2018 BERT pytorch implementation
Apache License 2.0
6.15k stars 1.3k forks

how does your code implement Bidirectional Transformers? #6

Closed mjc14 closed 5 years ago

mjc14 commented 5 years ago

hi, i am a new user of pytorch. i want to know which part of your code represents the Bidirectional Transformer? thanks.

codertimo commented 5 years ago

@mjc14 thank you for asking, it's a very good question. Well, I was confused by the meaning of the term "Bidirectional Transformer" at first too. But when I saw this description in the paper, I realized that Bidirectional Transformer = the Transformer encoder used in Attention Is All You Need:

We note that in the literature the bidirectional Transformer is often referred to as a “Transformer encoder” while the left-context-only version is referred to as a “Transformer decoder” since it can be used for text generation. -BERT Paper Section 3.1 Model Architecture

So "bi-directional" means that the Transformer's self-attention can see the context on both the left and the right side of each token. The implementation of the Transformer encoder block is here: https://github.com/codertimo/BERT-pytorch/blob/master/model/transformer.py.
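To make the distinction concrete, here is a minimal sketch (not the repo's actual code) using PyTorch's built-in `nn.TransformerEncoderLayer`. Bidirectionality comes simply from *not* applying a causal mask, so every position attends to tokens on both its left and its right; passing an upper-triangular mask would turn it into the left-context-only "decoder-style" variant the paper contrasts it with:

```python
import torch
import torch.nn as nn

# A single Transformer encoder layer. With no attention mask, self-attention
# is bidirectional: each token attends to the full sequence (left + right).
encoder_layer = nn.TransformerEncoderLayer(d_model=16, nhead=2, batch_first=True)
encoder_layer.eval()  # disable dropout for a deterministic illustration

x = torch.randn(1, 5, 16)  # (batch, seq_len, d_model)

# BERT-style: no mask, full bidirectional context.
out_bidirectional = encoder_layer(x)

# GPT-style (left-context-only): a causal mask blocks attention to the right.
# True entries mark positions that may NOT be attended to.
causal_mask = torch.triu(torch.ones(5, 5, dtype=torch.bool), diagonal=1)
out_causal = encoder_layer(x, src_mask=causal_mask)

print(out_bidirectional.shape)  # same shape either way; only the
print(out_causal.shape)         # attention pattern differs
```

The repo's own encoder block is hand-written rather than using `nn.TransformerEncoderLayer`, but the principle is the same: bidirectionality is a property of the (absence of a) mask, not a separate module.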

Please feel free to ask more if you have any questions :) have a good day

codertimo commented 5 years ago

@mjc14 if you don't have any further questions, I'll close this issue soon. Thanks!