zhihuacc opened this issue 2 years ago
Hi, I found that at this line ([build_inputs_with_special_tokens](https://github.com/salesforce/ALBEF/blob/75376bee33df87af9c206b4afb53c876927e7b2b/models/tokenization_bert.py#L294)) the returned list has a [1] appended at the end for a single input sequence, while the returned list [here](https://github.com/salesforce/ALBEF/blob/75376bee33df87af9c206b4afb53c876927e7b2b/models/tokenization_bert.py#L262) does NOT have a [SEP] appended in the same case. Why is that?

We remove [SEP] for a single-sentence input because it has a negligible effect on pre-training.

But why does get_special_tokens_mask still append a [1]? I thought this [1] is for [SEP], right?

Yes, you are right. I have modified the code so that the [1] is no longer appended. Thank you!
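For reference, here is a minimal sketch (not the actual ALBEF code) of how the two methods line up after the fix, assuming the single-sentence format is [CLS] + tokens with no trailing [SEP]; the token ids and the standalone function names are illustrative only.

```python
from typing import List, Optional

CLS_TOKEN_ID = 101  # assumed BERT vocab id for [CLS]
SEP_TOKEN_ID = 102  # assumed BERT vocab id for [SEP]


def build_inputs_with_special_tokens(
    token_ids_0: List[int], token_ids_1: Optional[List[int]] = None
) -> List[int]:
    # Single sentence: [CLS] X, with no trailing [SEP].
    if token_ids_1 is None:
        return [CLS_TOKEN_ID] + token_ids_0
    # Sentence pair: [CLS] A [SEP] B [SEP]
    return [CLS_TOKEN_ID] + token_ids_0 + [SEP_TOKEN_ID] + token_ids_1 + [SEP_TOKEN_ID]


def get_special_tokens_mask(
    token_ids_0: List[int], token_ids_1: Optional[List[int]] = None
) -> List[int]:
    # The mask must match build_inputs_with_special_tokens position by position:
    # 1 marks a special token, 0 marks a regular token.
    if token_ids_1 is None:
        # Only the leading [CLS] is special; no trailing [1] because no [SEP] is added.
        return [1] + [0] * len(token_ids_0)
    return [1] + [0] * len(token_ids_0) + [1] + [0] * len(token_ids_1) + [1]


if __name__ == "__main__":
    ids = [2023, 2003, 1037, 3231]  # example token ids (illustrative)
    inputs = build_inputs_with_special_tokens(ids)
    mask = get_special_tokens_mask(ids)
    assert len(inputs) == len(mask)  # lengths agree for single-sentence input
    print(inputs)
    print(mask)
```

The point of the change is simply that the mask stays the same length as the built input: once [SEP] is dropped for a single sentence, keeping the trailing [1] would mark a special token that no longer exists.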