google-research / long-range-arena

Long Range Arena for Benchmarking Efficient Transformers
Apache License 2.0
710 stars 77 forks

Are both the encoder and decoder implemented with sparse attention for BigBird? What is the verified output length for the decoder? #46

Open dongxinghua opened 2 years ago