Closed · coding-kt closed this issue 3 months ago
Hi, LiLT uses a token length of 512 during the pre-training phase. The most direct way to process a long document is to split it into chunks of 512 tokens. Alternatively, you can consider linearly resizing the position embeddings and then fine-tuning the model.
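For reference, here is a minimal sketch of the second option: linearly interpolating the text and layout position embedding tables to a larger size before fine-tuning. It assumes the Hugging Face checkpoint `SCUT-DLVCLab/lilt-roberta-en-base` and the attribute paths `embeddings.position_embeddings` / `layout_embeddings.box_position_embeddings`; these may differ across `transformers` versions, so verify them on your model. For simplicity it also interpolates the two reserved RoBERTa position slots along with the rest, which a more careful version would keep fixed.

```python
# Sketch: resize LiLT's position embeddings from 512 to 1024 tokens
# via linear interpolation, then fine-tune on long documents.
import torch
import torch.nn.functional as F
from transformers import LiltModel

model = LiltModel.from_pretrained("SCUT-DLVCLab/lilt-roberta-en-base")
new_len = 1024 + 2  # RoBERTa-style models reserve the first 2 position ids


def resize_pos_emb(emb: torch.nn.Embedding, new_num: int) -> torch.nn.Embedding:
    old = emb.weight.data  # shape: (old_num_positions, dim)
    # Treat each hidden dimension as a 1-D signal over positions and
    # stretch it to the new length with linear interpolation.
    resized = (
        F.interpolate(old.T.unsqueeze(0), size=new_num, mode="linear",
                      align_corners=False)
        .squeeze(0)
        .T
    )
    new_emb = torch.nn.Embedding(new_num, old.size(1),
                                 padding_idx=emb.padding_idx)
    new_emb.weight.data.copy_(resized)
    return new_emb


# Text stream position table.
model.embeddings.position_embeddings = resize_pos_emb(
    model.embeddings.position_embeddings, new_len
)
# The layout stream has its own position table that must match.
model.layout_embeddings.box_position_embeddings = resize_pos_emb(
    model.layout_embeddings.box_position_embeddings, new_len
)
model.config.max_position_embeddings = new_len
```

After resizing, the interpolated embeddings are only an initialization; fine-tuning on long inputs is still needed for them to work well at the new range.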
Hi,
LiLT processes a maximum of 512 tokens.
Is there a good option to get a comparable, commercially usable model that can process more tokens?
It is of course possible to split longer inputs into 512-token chunks, but this comes with some disadvantages and difficulties.
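For context, chunking itself is straightforward with the tokenizer's built-in overflow handling; the difficulty is rather in merging predictions across chunk boundaries. A minimal sketch, assuming the `SCUT-DLVCLab/lilt-roberta-en-base` checkpoint and an illustrative stride of 128 (LiLT additionally needs the word bounding boxes sliced into the same windows, not shown here):

```python
# Sketch: split a long document into overlapping 512-token windows.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("SCUT-DLVCLab/lilt-roberta-en-base")

long_text = "word " * 2000  # stand-in for OCR output of a long document

enc = tokenizer(
    long_text,
    max_length=512,
    truncation=True,
    stride=128,                      # overlap so boundary tokens recur
    return_overflowing_tokens=True,  # one entry per 512-token window
)
print(f"{len(enc['input_ids'])} windows of up to 512 tokens")
```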