bytedance / SPTSv2

The official implementation of SPTS v2: Single-Point Text Spotting
Apache License 2.0

code problem #5

Closed: KlayMa527 closed this issue 1 year ago

KlayMa527 commented 1 year ago

In sptsv2.py, line 60:

out_label = self.vocab_embed(hs[2])[-1][:,1:text_length+1].reshape(src.size(0),-1,self.num_classes)

I know the purpose of this code is to extract the predicted text, but why is the slice [:, 1:text_length+1]? What is the special meaning of index 1 here? The code uses center-point annotation, so shouldn't the slice start at 2? Also, if Bezier annotation is used, I assume this index needs to be modified.

zhangjx123 commented 1 year ago

Q: Why use [:, 1:text_length+1]?
A: Because the first output is the y location, which is not needed for text recognition. If you use Bezier annotation, the 1 should be changed to 15.
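One way to read the 1 and the 15 is sketched below. It is an illustration only, not code from the repository. It assumes the decoder is fed the location tokens followed by the text, and that output position i predicts input token i + 1, so the output holds one fewer location entry than the input. It also assumes a Bezier annotation has 16 coordinate values. The helpers `text_slice` and `toy_outputs` are hypothetical names introduced here.

```python
def text_slice(num_location_tokens, text_length):
    """Return (start, stop) bounds of the text part of the decoder output.

    Assumption: the output is the input shifted by one token, so it
    contains num_location_tokens - 1 location predictions before the text.
    """
    start = num_location_tokens - 1
    return start, start + text_length


def toy_outputs(location_names, text):
    """Build a toy output sequence: the input sequence shifted by one."""
    inputs = list(location_names) + list(text)
    return inputs[1:]  # output i corresponds to input token i + 1


# Center point: inputs are [x, y, chars...], outputs are [y, chars...].
point_out = toy_outputs(["x", "y"], "abc")
start, stop = text_slice(num_location_tokens=2, text_length=3)
point_text = point_out[start:stop]  # slice 1:4 skips only the y prediction

# Bezier (assumed 16 coordinate values): 15 location outputs precede the text.
bezier_names = [f"c{i}" for i in range(16)]
bezier_out = toy_outputs(bezier_names, "abc")
b_start, b_stop = text_slice(num_location_tokens=16, text_length=3)
bezier_text = bezier_out[b_start:b_stop]  # slice 15:18
```

Under these assumptions the slice start is always the number of location values minus one, which matches 1 for a center point and 15 for Bezier.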

KlayMa527 commented 1 year ago

Thanks for your answer, I understand now.