Open linan142857 opened 4 years ago
Yes. We treat '##LTLine##' as a special token during training and prediction.
Hi! Could you please tell me the integer identifiers of the ##LTLine## and ##LTFigure## tokens within LayoutLM's vocabulary?
Thanks
In fact, we did not add them to the vocabulary. They are tokenized into sub-tokens like any other text and labeled in the way I mentioned in #25.
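To illustrate what "tokenized into tokens" can mean for a marker that is not in the vocabulary, here is a minimal sketch. It assumes a BERT-style uncased tokenizer (punctuation splitting followed by greedy WordPiece matching). `TOY_VOCAB` and the helper names are hypothetical and exist only for this example; the real LayoutLM vocabulary is much larger and may split the marker differently.

```python
import string

# Hypothetical toy vocabulary for illustration only; not LayoutLM's real vocabulary.
TOY_VOCAB = {"[UNK]", "#", "lt", "##line", "##figure"}


def split_on_punctuation(text):
    """Lowercase the text and isolate every punctuation character."""
    pieces, current = [], ""
    for ch in text.lower():
        if ch in string.punctuation or ch.isspace():
            if current:
                pieces.append(current)
                current = ""
            if not ch.isspace():
                pieces.append(ch)
        else:
            current += ch
    if current:
        pieces.append(current)
    return pieces


def wordpiece(word, vocab):
    """Greedy longest-match-first split of one word into sub-tokens."""
    tokens, start = [], 0
    while start < len(word):
        end, match = len(word), None
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation-piece prefix
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            return ["[UNK]"]  # no piece matched, so the whole word is unknown
        tokens.append(match)
        start = end
    return tokens


def tokenize(text, vocab=TOY_VOCAB):
    """Split on punctuation, then apply WordPiece to each piece."""
    out = []
    for word in split_on_punctuation(text):
        out.extend(wordpiece(word, vocab))
    return out


print(tokenize("##LTLine##"))    # ['#', '#', 'lt', '##line', '#', '#']
print(tokenize("##LTFigure##"))  # ['#', '#', 'lt', '##figure', '#', '#']
```

Under these assumptions the marker has no single integer identifier; it maps to a sequence of ordinary sub-token ids, each of which would then receive a label.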
Thanks
Dear author, some documents contain a massive number of non-text elements, such as hundreds of thousands of "##LTLine##" entries. How do you actually deal with them? For example, do you train and predict on all of those elements with the text '##LTLine##'?
Thank you!