codertimo / BERT-pytorch

Google AI 2018 BERT pytorch implementation
Apache License 2.0

Why is the number of segment embeddings only 3? #106

Open UTimeStrange opened 7 months ago

UTimeStrange commented 7 months ago
import torch.nn as nn

class SegmentEmbedding(nn.Embedding):
    def __init__(self, embed_size=512):
        super().__init__(3, embed_size, padding_idx=0)

This is the source code. Index 0 is reserved for padding, so only 2 segments are supported. Why does BERT support only 2 segments?

songyandong commented 3 months ago

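Because two sentences are fed in at a time, the model needs to distinguish which tokens belong to the first sentence and which belong to the second. Add the padding index and you get exactly three.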
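
To make the count concrete, here is a minimal sketch of how segment labels for a sentence pair might be built. It assumes the id scheme described above (0 for padding, 1 for the first sentence, 2 for the second). The helper name `build_segment_labels` and the example tokens are hypothetical and not taken from the repository.

```python
# Assumed id scheme: 0 = padding, 1 = first sentence, 2 = second sentence.
PAD, SENT_A, SENT_B = 0, 1, 2


def build_segment_labels(tokens_a, tokens_b, seq_len):
    """Return one segment id per position, right-padded with PAD to seq_len."""
    labels = [SENT_A] * len(tokens_a) + [SENT_B] * len(tokens_b)
    labels = labels[:seq_len]  # truncate pairs longer than seq_len
    return labels + [PAD] * (seq_len - len(labels))


# Hypothetical tokenized sentence pair, for illustration only.
labels = build_segment_labels(
    ["[CLS]", "the", "cat", "[SEP]"],
    ["it", "sat", "[SEP]"],
    seq_len=10,
)
print(labels)            # [1, 1, 1, 1, 2, 2, 2, 0, 0, 0]
print(len(set(labels)))  # 3
```

Every label is 0, 1, or 2, so an embedding table with 3 rows covers all positions, and `padding_idx=0` keeps the padding row as a zero vector that receives no gradient updates.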