In this paper, the authors consider efficiency, yet inference remains time-consuming: each sample x of length n produces 8n * k templates, where k is the number of entity types.
The source sequence of the model is an input text X = {x1, . . . , xn}, and the target sequence Tyk,xi:j = {t1, . . . , tm} is a template filled with a candidate text span xi:j and an entity type yk.
For efficiency, we restrict span lengths to between one and eight tokens, so 8n templates are created for each sentence.
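The enumeration above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the template wording ("<span> is a <type> entity") and the `enumerate_templates` helper are assumptions for demonstration.

```python
def enumerate_templates(tokens, entity_types, max_span_len=8):
    """Pair every span of 1..max_span_len tokens with each entity type.

    For a sentence of n tokens and k entity types this yields roughly
    8n * k templates (fewer near the end, where spans are truncated).
    """
    templates = []
    n = len(tokens)
    for i in range(n):
        # Spans starting at i, capped at max_span_len tokens.
        for j in range(i + 1, min(i + max_span_len, n) + 1):
            span = " ".join(tokens[i:j])
            for y in entity_types:
                # Illustrative template form; the actual template text
                # would follow the paper's design.
                templates.append(f"{span} is a {y} entity")
    return templates

tokens = "Barack Obama visited Paris".split()
out = enumerate_templates(tokens, ["person", "location"])
print(len(out))  # 10 spans * 2 types = 20 templates for this 4-token sentence
```

Scoring each template with the decoder is what makes inference cost scale with both sentence length and the size of the type set.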