jordanparker6 opened this issue 1 year ago
I was able to use the provided script (the one for creating `lilt-roberta-base-en`) with https://huggingface.co/google/bigbird-roberta-base as the text model. If I can get this working, I will post the result to the Hugging Face Hub. BigBird uses the same tokenizer as RoBERTa, so tokenization is not an issue.
However, the following error occurs when loading the model:

```
RuntimeError: Error(s) in loading state_dict for LiltForTokenClassification:
	size mismatch for lilt.layout_embeddings.box_position_embeddings.weight: copying a param with shape torch.Size([514, 192]) from checkpoint, the shape in current model is torch.Size([4096, 192]).
You may consider adding `ignore_mismatched_sizes=True` in the model `from_pretrained` method.
```
I think this error is created when the PyTorch state dicts are fused on the following line:

```python
total_model = {**text_model, **lilt_model}
```
Because `lilt_model` is unpacked last, its tensor dimensions override the incoming BigBird dimensions wherever the keys collide.
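A quick way to see why the RoBERTa-sized tensor survives the merge: in Python dict unpacking, the later mapping wins on duplicate keys. A minimal sketch of the collision described above, with hypothetical entries (plain tuples stand in for tensor shapes, and the key name is borrowed from the error message):

```python
# Tuples stand in for torch.Size; shapes mirror the error message.
text_model = {"lilt.layout_embeddings.box_position_embeddings.weight": (4096, 192)}  # BigBird-sized
lilt_model = {"lilt.layout_embeddings.box_position_embeddings.weight": (514, 192)}   # RoBERTa-sized

# In {**a, **b}, b's value wins every key collision, so the fused
# checkpoint ends up carrying the (514, 192) layout table.
total_model = {**text_model, **lilt_model}
print(total_model["lilt.layout_embeddings.box_position_embeddings.weight"])  # (514, 192)
```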
Would it be problematic to switch this to:

```python
total_model = {**lilt_model, **text_model}
```
Or would this break the architecture?
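One alternative to swapping the unpack order wholesale (which would take *every* overlapping key from the text model, not just the mismatched one) would be a shape-aware merge. This is only a hedged sketch, not code from the LiLT repository; `fuse_state_dicts` is a hypothetical helper, and plain tuples again stand in for tensor shapes (with real tensors the comparison would be on `.shape`):

```python
def fuse_state_dicts(text_model, lilt_model):
    """Merge two state dicts, letting lilt_model win ties except where
    the shapes disagree, in which case the text model's tensor is kept."""
    total = {**text_model, **lilt_model}  # lilt_model wins ties by default
    for key, text_tensor in text_model.items():
        lilt_tensor = lilt_model.get(key)
        if lilt_tensor is not None and lilt_tensor != text_tensor:
            # Shape conflict: keep the text encoder's tensor (e.g. BigBird's
            # 4096-row position table) so it matches the target architecture.
            total[key] = text_tensor
    return total

text_model = {"pos.weight": (4096, 192), "word.weight": (50358, 768)}
lilt_model = {"pos.weight": (514, 192), "layout.weight": (192, 768)}
fused = fuse_state_dicts(text_model, lilt_model)
print(fused["pos.weight"])     # (4096, 192)
print(fused["layout.weight"])  # (192, 768)
```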
Would it be possible to use LiLT with BigBird-RoBERTa-base models? If so, any feedback on the best approach for doing so? What might need changing in the LiLT repository?