Closed · ZHEGG closed this issue 1 year ago
Hi,
the provided LiLT-based checkpoint is trained as described in the original paper. It should be combined with an off-the-shelf plain-text pre-trained model for fine-tuning.
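For readers wondering what "combined" means in practice, here is a minimal sketch of the weight-fusing step. It assumes both checkpoints are plain parameter dicts (in practice, PyTorch `state_dict`s), and the `text.`/`layout.` key prefixes are illustrative, not the repo's actual parameter names:

```python
def fuse_checkpoints(lilt_state, roberta_state):
    """Sketch: overwrite LiLT's text-flow weights with a freshly loaded
    RoBERTa-BASE, keeping LiLT's pre-trained layout-flow weights.
    Key names here are hypothetical, not the repo's real ones."""
    fused = dict(lilt_state)              # start from the LiLT checkpoint
    for name, tensor in roberta_state.items():
        fused[name] = tensor              # text-flow params take RoBERTa's values
    return fused

# Toy usage with stand-in "tensors" (lists instead of torch.Tensor):
lilt = {"layout.embed": [0.0, 0.0], "text.embed": [0.0, 0.0]}
roberta = {"text.embed": [1.0, 1.0]}
fused = fuse_checkpoints(lilt, roberta)
```

The resulting `fused` dict keeps LiLT's layout-flow entries untouched while its text-flow entries now hold the RoBERTa weights, which is the "combine LiLT with a pre-trained RoBERTa for fine-tuning" recipe the paper describes.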
I went over the paper again. So LiLT-BASE initializes the text flow from the existing pre-trained English RoBERTa-BASE, and when fine-tuning it needs to load RoBERTa-BASE again, as the paper says: "combine LiLTBASE with a new pre-trained RoBERTaBASE for finetuning". If I am right, why reload RoBERTa-BASE again when fine-tuning in English? It seems a little redundant.
It's a nice question. As you said, the pre-trained LiLT can be applied directly in English without reloading the text-flow weights, and it can get a better result that way. However, we reload English RoBERTa-BASE again for a consistent comparison across different languages.
OK, I see. Thanks for your reply!
Thanks for this amazing code! I have a question about LiLT-based: does it mean that the text stream is not combined with any pre-trained language model and is trained from scratch together with the layout stream?