How to build a LiLT XLM-RoBERTa model with the LayoutLMv3 tokenizer

As the title says, I want to understand how to create a LiLT XLM-RoBERTa model that uses the LayoutLMv3 tokenizer.

The SCUT-DLVCLab/lilt-roberta-en-base checkpoint uses a LayoutLMv3 tokenizer, but the SCUT-DLVCLab/lilt-infoxlm-base checkpoint doesn't use a RoBERTa tokenizer.

So I want to find out how to do that. I've already trained a LiLT model for Italian using the official code (https://github.com/jpWang/LiLT), but I used the RoBERTa tokenizer (Italian version), and I'm pretty sure that if I simply replace the RoBERTa tokenizer with the LayoutLMv3 tokenizer, the code will break.
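For context, here is a minimal sketch of what, as far as I understand, a layout-aware tokenizer such as the LayoutLMv3 one adds on top of a plain RoBERTa tokenizer: it takes words plus one bounding box per word, and repeats each word's box for every subword token that word is split into. This is plain Python and not the Transformers API. `toy_subword_tokenize` and `encode_with_boxes` are hypothetical names made up for illustration, the 3-character split is only a stand-in for real BPE, and the all-zero box for special tokens is an assumption.

```python
from typing import Callable, Dict, List

Box = List[int]  # [x0, y0, x1, y1], assumed to be normalized to the 0-1000 range

# Assumption: special tokens (<s>, </s>) receive an all-zero box.
SPECIAL_BOX: Box = [0, 0, 0, 0]


def toy_subword_tokenize(word: str) -> List[str]:
    """Hypothetical stand-in for a BPE tokenizer: split a word into 3-char pieces."""
    return [word[i:i + 3] for i in range(0, len(word), 3)]


def encode_with_boxes(
    words: List[str],
    boxes: List[Box],
    tokenize: Callable[[str], List[str]] = toy_subword_tokenize,
) -> Dict[str, list]:
    """Tokenize word by word and give every subword the box of its source word."""
    if len(words) != len(boxes):
        raise ValueError("expected exactly one box per word")

    tokens: List[str] = ["<s>"]
    bbox: List[Box] = [SPECIAL_BOX]
    for word, box in zip(words, boxes):
        pieces = tokenize(word)
        tokens.extend(pieces)
        # One copy of the word-level box per subword piece.
        bbox.extend([list(box) for _ in pieces])
    tokens.append("</s>")
    bbox.append(SPECIAL_BOX)
    return {"tokens": tokens, "bbox": bbox}


if __name__ == "__main__":
    words = ["Fattura", "n.", "42"]
    boxes = [[50, 40, 180, 60], [190, 40, 210, 60], [215, 40, 240, 60]]
    encoded = encode_with_boxes(words, boxes)
    for token, box in zip(encoded["tokens"], encoded["bbox"]):
        print(token, box)
```

A plain RoBERTa tokenizer only produces the token side of this, so the training code has to do the box alignment itself. That alignment step is the part I expect to break if I swap tokenizers.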

Has anyone tried this?

NielsRogge / Transformers-Tutorials
