roatienza / deep-text-recognition-benchmark

PyTorch code of my ICDAR 2021 paper Vision Transformer for Fast and Efficient Scene Text Recognition (ViTSTR)
Apache License 2.0

Question about [GO] and [s] #26

Closed sparrow0629 closed 2 years ago

sparrow0629 commented 2 years ago

Hi, thanks for your amazing work. When you convert labels with the TokenLabelConverter class, you pad them with [GO], which is ignored during loss calculation. However, Figure 4 in the paper shows the label padded with [s]. Does this make any difference in accuracy?

roatienza commented 2 years ago

[GO] is the start token. After the predicted text, the succeeding tokens are spaces, which are represented by [s].
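To make the two tokens concrete, here is a minimal pure-Python sketch of the scheme discussed in this thread. It is not the repository's TokenLabelConverter: the helper names (`build_vocab`, `encode`, `loss_mask`, `decode`) and the toy character set are hypothetical, and padding with the [GO] index and skipping that index in the loss are assumptions taken from the question above.

```python
GO, SPACE = "[GO]", "[s]"  # start token and space/end token from the thread


def build_vocab(charset):
    # Assumption: the two special tokens come first, so [GO] has index 0.
    tokens = [GO, SPACE] + list(charset)
    return {tok: i for i, tok in enumerate(tokens)}


def encode(text, vocab, max_len):
    # [GO] + characters + [s], then pad the rest with the [GO] index.
    ids = [vocab[GO]] + [vocab[c] for c in text] + [vocab[SPACE]]
    if len(ids) > max_len:
        raise ValueError("text is too long for max_len")
    return ids + [vocab[GO]] * (max_len - len(ids))


def loss_mask(ids, vocab):
    # Positions holding the [GO] index are the ones a loss would skip
    # if it were configured to ignore that index (assumed here).
    return [i != vocab[GO] for i in ids]


def decode(ids, vocab):
    # Read past the start token and stop at the first [s].
    inverse = {i: tok for tok, i in vocab.items()}
    chars = []
    for i in ids[1:]:
        tok = inverse[i]
        if tok == SPACE:
            break
        if tok != GO:
            chars.append(tok)
    return "".join(chars)


vocab = build_vocab("abc")
ids = encode("cab", vocab, max_len=7)
print(ids)                    # [0, 4, 2, 3, 1, 0, 0]
print(loss_mask(ids, vocab))  # [False, True, True, True, True, False, False]
print(decode(ids, vocab))     # cab
```

In this sketch the first [s] after the text still counts toward the loss, while the [GO]-padded positions do not, so the choice of padding symbol only affects positions that decoding discards after the first [s] anyway.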