jpWang / LiLT

Official PyTorch implementation of LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding (ACL 2022)

Export model using distilroberta-base #34

Closed · fracav closed this issue 3 months ago

fracav commented 1 year ago

I'm trying to export a LiLT model that uses distilroberta-base. I get the error: `Some weights of the model checkpoint at lilt-distilroberta-base were not used when initializing LiltForTokenClassification.` The colab I'm testing with is the following: https://colab.research.google.com/drive/1k2uGoDBOQwrK4iokGJOQKdDfll0QPbl-#scrollTo=0ZmvE7ku4hSW.
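
For reference, this is roughly the load step that triggers the warning (using the transformers class named in the error; the colab may wire things up slightly differently):

```python
from transformers import LiltForTokenClassification

# Loading the exported checkpoint is what emits the
# "Some weights ... were not used" warning.
model = LiltForTokenClassification.from_pretrained("lilt-distilroberta-base")
```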

Can you help me figure out what I'm doing wrong? Thanks in advance

logan-markewich commented 1 year ago

I took a look at your colab. I'm not sure what the exact problem is, but it's probably something to do with how the layers are named in distilroberta vs. roberta.

I see someone has already uploaded lilt-distilroberta to huggingface. I haven't tried it, but if it works, you can compare layer names with that: https://huggingface.co/Sennodipoi/lilt-distilroberta-base/tree/main
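
Something like this would surface any naming differences between the two checkpoints (the `pytorch_model.bin` file names and your local path are assumptions; adjust them to what your export actually contains):

```python
import torch
from huggingface_hub import hf_hub_download

# Reference checkpoint that reportedly loads cleanly.
ref_path = hf_hub_download("Sennodipoi/lilt-distilroberta-base", "pytorch_model.bin")
# Your own exported checkpoint (path assumed).
own_path = "lilt-distilroberta-base/pytorch_model.bin"

ref_keys = set(torch.load(ref_path, map_location="cpu").keys())
own_keys = set(torch.load(own_path, map_location="cpu").keys())

# Keys that appear on only one side are the likely source of the
# "weights not used" warning.
print("only in reference:", sorted(ref_keys - own_keys))
print("only in your export:", sorted(own_keys - ref_keys))
```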

If you look at gen_weight_roberta_like.py, all it does is rename weight layers. You may have to add some extra code for distilroberta to work.
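
To illustrate, here's a minimal sketch of that renaming step; the file names and the `"lilt."` key prefix are placeholders of mine, not the script's real mapping, so check gen_weight_roberta_like.py for the actual names:

```python
import torch

# Hypothetical paths: the text-encoder checkpoint and the LiLT-only weights.
text_state = torch.load("distilroberta-base/pytorch_model.bin", map_location="cpu")
lilt_state = torch.load("lilt-only-base/pytorch_model.bin", map_location="cpu")

merged = dict(lilt_state)
for key, value in text_state.items():
    # Move each text-encoder weight into the key namespace the fused model
    # expects ("lilt." is a placeholder; the script defines the real prefix).
    # Any distilroberta key that doesn't line up with what
    # LiltForTokenClassification expects is reported as "not used" at load time.
    merged["lilt." + key] = value

torch.save(merged, "lilt-distilroberta-base/pytorch_model.bin")
```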

fracav commented 1 year ago

Hi Logan, I had also seen the model at https://huggingface.co/Sennodipoi/lilt-distilroberta-base/tree/main; it's exactly what led me to try distilroberta as a base. I can load that model successfully without getting the same error. Since I have a distilroberta model fine-tuned on a specific domain, I wanted to try using it as the basis for LiLT.

Thank you for the answer :) I'll check whether it's a mismatch between the layer names.