I noticed that an eos token `[SEP]` is appended to each answer when creating the VQA dataset here. Since `BertTokenizer` already appends a `[SEP]` token to the end of each input text anyway, is there a reason an additional eos token is added to each answer? (The tokenized `input_ids` of each answer end with two `102`s, the `sep_token_id`.)
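To illustrate the duplication I'm describing, here is a minimal sketch. It does not use the real `BertTokenizer` (to stay self-contained); the token ids and the `encode_like_bert` helper are stand-ins that mimic BERT's `[CLS] ... [SEP]` encoding, and the answer ids are made up for illustration:

```python
# Stand-in ids matching bert-base-uncased's special tokens.
CLS_ID = 101  # [CLS]
SEP_ID = 102  # [SEP], also used as the eos token here

def encode_like_bert(token_ids):
    """Mimics tokenizer(text)["input_ids"]: wraps tokens in [CLS] ... [SEP]."""
    return [CLS_ID] + token_ids + [SEP_ID]

answer_tokens = [2748]  # hypothetical ids for an answer like "yes"

# The dataset appends an explicit eos ([SEP]) to the answer, and the
# tokenizer then appends its own [SEP] on top of that:
input_ids = encode_like_bert(answer_tokens + [SEP_ID])

print(input_ids[-2:])  # the sequence ends with two 102s
```

This reproduces the double-`102` ending I'm seeing, which is why I'm asking whether the extra eos is intentional (e.g. for a decoder that stops on the first `[SEP]`) or redundant.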