allenai / longformer

Longformer: The Long-Document Transformer
https://arxiv.org/abs/2004.05150
Apache License 2.0

Adapting this repo to current version of transformers library #220

Open MorenoLaQuatra opened 2 years ago

MorenoLaQuatra commented 2 years ago

Hi,

First of all, thank all of you for the amazing work! I was wondering if there is any plan to update this repo to work with the latest version of HF transformers library.

When I try to load the model I get the following error (similar errors are reported in other issues):

RuntimeError: Error(s) in loading state_dict for LongformerEncoderDecoderForConditionalGeneration:
    size mismatch for model.encoder.embed_positions.weight: copying a param with shape torch.Size([4098, 1024]) from checkpoint, the shape in current model is torch.Size([1026, 1024]).

Is there an effective way to make it work?
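(For context: the shapes in the traceback come from the learned positional-embedding table. The Longformer-Encoder-Decoder checkpoint stores 4096 positions plus BART's 2-position offset, i.e. 4098, while current transformers builds a stock BART encoder with only 1024 + 2 = 1026. A minimal hypothetical reproduction of the same PyTorch error, using just those sizes:)

```python
import torch
import torch.nn as nn

# The checkpoint's positional embeddings: 4096 positions + BART's 2-position offset.
checkpoint_state = {"weight": torch.zeros(4098, 1024)}
# The embedding table built by the current model: 1024 positions + 2.
model_positions = nn.Embedding(1026, 1024)

error_message = ""
try:
    # Loading a larger table into a smaller one fails with the same
    # "size mismatch ... copying a param with shape" RuntimeError.
    model_positions.load_state_dict(checkpoint_state)
except RuntimeError as err:
    error_message = str(err)
print(error_message)
```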

hyesunyun commented 2 years ago

Hi!

I get the same issue. Would love to know how to resolve it. I am trying to convert a custom pretrained BART into a Longformer encoder-decoder and hit this error.

cppww commented 2 years ago

@MorenoLaQuatra Same issue for me. Have you solved this problem? I downgraded transformers to 3.1.0 and it still doesn't work.

MorenoLaQuatra commented 2 years ago

I needed to use the pretrained Longformer model, so I just followed the instructions to install the specific transformers version pinned by this repo (and it worked). However, I had to create a new environment for that; I was not able to make it work while keeping the updated transformers version.

I think that anyone who just wants to use a Longformer encoder-decoder can go with the LED model, which is now available directly in transformers.