google-research / bigbird

Transformers for Longer Sequences
https://arxiv.org/abs/2007.14062
Apache License 2.0

Question about pre-trained weights #1

Closed patrickvonplaten closed 3 years ago

patrickvonplaten commented 3 years ago

Thanks so much for releasing BigBird!

Quick question about the pre-trained weights: do bigbr_large and bigbr_base correspond to BERT-like encoder-only checkpoints, and bigbp_large to the encoder-decoder version?

manzilz commented 3 years ago

Yes, you are correct. I should have provided more detailed documentation.

manzilz commented 3 years ago

I have updated the README and am closing this issue for now. Feel free to re-open it if you have any further questions.

patrickvonplaten commented 3 years ago

Awesome, thanks so much for the quick reply :-)