google-research / long-range-arena

Long Range Arena for Benchmarking Efficient Transformers
Apache License 2.0
710 stars, 77 forks

Longformer missing in Fig. 3 #13

Closed: albusdemens closed this issue 3 years ago

albusdemens commented 3 years ago

Hello, is there a reason why you didn't include Longformer in Fig. 3 of your paper? Cheers.

vanzytay commented 3 years ago

Hi!

Longformer and Sparse Transformer require specialized CUDA kernels to achieve their speedup and memory savings. This does not play well with hardware such as TPUs.

In our experiments, we benchmarked only the quality (accuracy) of Longformer and Sparse Transformer, by simulating their sparsity patterns on top of dense attention. This simulation yields no memory or speed gains. Since the plot relies on measured speed and memory, it does not make sense to include Longformer and Sparse Transformer in it.

Thanks.
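
To illustrate what "simulating the sparsity" can mean, here is a minimal sketch, not the actual Long Range Arena code. It assumes a simple sliding-window pattern (a local pattern in the spirit of Longformer) applied as a mask on top of ordinary dense attention. The function names `sliding_window_mask` and `masked_dense_attention` and the `window` parameter are hypothetical, chosen only for this example. It uses only the Python standard library for portability; a real implementation would use an array library.

```python
import math


def sliding_window_mask(seq_len, window):
    """Hypothetical helper: position i may attend to j only if |i - j| <= window."""
    return [[abs(i - j) <= window for j in range(seq_len)] for i in range(seq_len)]


def masked_dense_attention(q, k, v, mask):
    """Dense attention with a sparsity pattern simulated through masking.

    All seq_len * seq_len scores are still computed and stored, so time and
    memory stay quadratic. Only the *output* matches the sparse pattern.
    """
    n, d = len(q), len(q[0])
    scale = 1.0 / math.sqrt(d)

    # Step 1: full (dense) score matrix, exactly as in vanilla attention.
    scores = [
        [scale * sum(a * b for a, b in zip(q[i], k[j])) for j in range(n)]
        for i in range(n)
    ]

    out = []
    for i in range(n):
        # Step 2: disallowed positions get -inf, so their softmax weight is 0.
        row = [s if mask[i][j] else -math.inf for j, s in enumerate(scores[i])]
        row_max = max(row)  # finite, because the diagonal is always allowed
        weights = [math.exp(s - row_max) for s in row]
        total = sum(weights)
        # Step 3: weighted sum over *all* values (masked ones contribute 0).
        out.append([
            sum(w * v[j][c] for j, w in enumerate(weights)) / total
            for c in range(len(v[0]))
        ])
    return out


if __name__ == "__main__":
    n = 6
    mask = sliding_window_mask(n, window=1)
    q = k = [[1.0, 0.0] for _ in range(n)]
    v = [[float(i)] for i in range(n)]
    out = masked_dense_attention(q, k, v, mask)
    allowed = sum(sum(row) for row in mask)
    print(f"allowed pairs: {allowed}, scores computed: {n * n}")
```

The accuracy of the model reflects the sparse pattern, because masked positions receive zero attention weight. The cost does not: every query is still scored against every key. This is why such a simulation can be used to measure quality but not speed or memory.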