Closed: KlayMa527 closed this issue 1 year ago
Q: Why use [:, 1:text_length+1]? A: Because the first output is the location y, which is not needed for text recognition. If Bezier annotation is used, the 1 should be changed to 15.
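To make the offset concrete, here is a minimal sketch in plain Python rather than the repository's actual tensors. It assumes each instance's output sequence starts with some location slots followed by text_length character slots. The helper name slice_text_tokens and the toy values are hypothetical; the offsets 1 (center point) and 15 (Bezier) are taken from the answer above.

```python
def slice_text_tokens(seq_batch, text_length, num_location_slots=1):
    """Skip the leading location slots in every row, keep the next text_length slots.

    On a 2-D tensor this corresponds to seq_batch[:, n:n + text_length],
    where n = num_location_slots.
    """
    start = num_location_slots
    return [row[start:start + text_length] for row in seq_batch]


text_length = 3

# Center point annotation (assumed layout): one location slot, then the text.
center_rows = [
    ["y", "c0", "c1", "c2"],
    ["y", "d0", "d1", "d2"],
]
center_text = slice_text_tokens(center_rows, text_length, num_location_slots=1)
print(center_text)

# Bezier annotation (assumed layout): 15 location slots, then the text.
bezier_rows = [[f"loc{i}" for i in range(15)] + ["c0", "c1", "c2"]]
bezier_text = slice_text_tokens(bezier_rows, text_length, num_location_slots=15)
print(bezier_text)
```

With an offset of 1 the slice is [:, 1:text_length+1], which matches line 60; with an offset of 15 it would become [:, 15:text_length+15].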
I appreciate your answer. I understand it now.
In sptsv2.py, line 60: out_label = self.vocab_embed(hs[2])[-1][:, 1:text_length+1].reshape(src.size(0), -1, self.num_classes)
I know the purpose of this code is to extract the predicted text, but why does the index use [:, 1:text_length+1]? What is the special meaning of index 1 here? The code uses center point annotation, so should the slice start at 2 instead? And if Bezier annotation is used, does this index need to be modified?