jpWang / LiLT

Official PyTorch implementation of LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding (ACL 2022)

Usage with BigBird-Roberta-Base #36

Open · jordanparker6 opened this issue 1 year ago

jordanparker6 commented 1 year ago

Would it be possible to use LiLT with BigBird-Roberta-Base models?

If so, is there any feedback on the best approach to doing so? What might need changing in the LiLT repository to support it?

https://huggingface.co/google/bigbird-roberta-base

jordanparker6 commented 1 year ago

I was able to use the provided script to create a LiLT + BigBird checkpoint (analogous to lilt-roberta-base-en) from https://huggingface.co/google/bigbird-roberta-base. If I can get this working, I will post it to the Hugging Face Hub.

BigBird uses the same tokenizer as RoBERTa, so there is no issue with tokenization for google/bigbird-roberta-base.
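A quick way to double-check that tokenizer assumption with the standard AutoTokenizer API:

from transformers import AutoTokenizer

# Load both tokenizers and compare their types and vocabulary sizes
# to confirm they are actually interchangeable.
bigbird_tok = AutoTokenizer.from_pretrained("google/bigbird-roberta-base")
roberta_tok = AutoTokenizer.from_pretrained("roberta-base")
print(type(bigbird_tok).__name__, type(roberta_tok).__name__)
print(bigbird_tok.vocab_size, roberta_tok.vocab_size)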

However, the following error occurs when loading the model.

RuntimeError: Error(s) in loading state_dict for LiltForTokenClassification:
    size mismatch for lilt.layout_embeddings.box_position_embeddings.weight: copying a param with shape torch.Size([514, 192]) from checkpoint, the shape in current model is torch.Size([4096, 192]). You may consider adding ignore_mismatched_sizes=True in the model from_pretrained method.
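For reference, the workaround the error itself suggests would look roughly like the following (checkpoint path hypothetical, class name taken from the traceback); note that ignore_mismatched_sizes=True randomly re-initializes the mismatched layout embedding instead of reusing the pretrained [514, 192] weights:

from transformers import LiltForTokenClassification

# Hypothetical local path to the fused LiLT + BigBird checkpoint.
# The mismatched box_position_embeddings tensor is dropped from the
# checkpoint and freshly initialized at [4096, 192].
model = LiltForTokenClassification.from_pretrained(
    "path/to/lilt-bigbird-roberta-base",
    ignore_mismatched_sizes=True,
)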

I think this error arises when the PyTorch state dicts are fused with the following line:

total_model = {**text_model, **lilt_model}

The lilt_model checkpoint was pretrained with 514 box positions, so the merged state dict carries a [514, 192] layout embedding, while the model built from the BigBird config expects [4096, 192].
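For context, in a Python dict merge the right-hand operand wins on duplicate keys, which is why the ordering above matters; a toy illustration with made-up values:

# In {**a, **b}, entries from b overwrite entries from a on duplicate keys.
text_model = {"shared.weight": "bigbird value"}
lilt_model = {"shared.weight": "lilt value"}
total_model = {**text_model, **lilt_model}
print(total_model["shared.weight"])  # -> "lilt value"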

Would it be problematic to switch this:

total_model = {**lilt_model, **text_model}

Or would this break the architecture?
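One way to check whether the swap would change anything at all: reversing the ** order only affects keys present in both state dicts, so listing the shared keys with mismatched shapes shows exactly what the swap touches. A minimal sketch, with checkpoint paths hypothetical:

import torch

# Hypothetical paths to the two checkpoints the generation script fuses.
text_model = torch.load("bigbird-roberta-base/pytorch_model.bin", map_location="cpu")
lilt_model = torch.load("lilt-only-base/pytorch_model.bin", map_location="cpu")

# Keys present only in lilt_model (most likely the layout embeddings)
# end up in the merged dict regardless of the ** order.
shared = sorted(set(text_model) & set(lilt_model))
for key in shared:
    if text_model[key].shape != lilt_model[key].shape:
        print(key, tuple(text_model[key].shape), "vs", tuple(lilt_model[key].shape))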