google-research / long-range-arena

Long Range Arena for Benchmarking Efficient Transformers

Question regarding Pathfinder and ListOps performance #60

Open LeoXinhaoLee opened 11 months ago

LeoXinhaoLee commented 11 months ago

Hi, thank you for releasing the code for this inspiring work! When I was trying to reproduce the results of the Transformer and the Linear Transformer on the Pathfinder32 and ListOps tasks, I ran into the following problems:

(1) The Transformer and the Linear Transformer only reached about 50% accuracy on Pathfinder32. If I replaced the fixed positional encoding (from the official config) with a learnable positional embedding, the Transformer got to around 70%, but the Linear Transformer stayed at 50%. (A sketch of the swap is below.)

(2) On ListOps, the Transformer only reached about 17% accuracy with either the fixed positional encoding (official config) or a learnable positional embedding.
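For reference, the swap mentioned in (1) looks roughly like this. This is a generic Flax sketch, not the repo's own code; the module name `AddPosEmb` and the `learnable` flag are made up for illustration, and the sinusoidal table assumes an even feature dimension:

```python
import numpy as np
import jax.numpy as jnp
import flax.linen as nn


def sinusoidal_pos_emb(max_len, dim):
    # Fixed (non-learnable) sinusoidal table; dim assumed even.
    pos = np.arange(max_len)[:, None]
    div = np.exp(np.arange(0, dim, 2) * -(np.log(10000.0) / dim))
    table = np.zeros((max_len, dim), dtype=np.float32)
    table[:, 0::2] = np.sin(pos * div)
    table[:, 1::2] = np.cos(pos * div)
    return jnp.asarray(table[None])  # shape (1, max_len, dim)


class AddPosEmb(nn.Module):
    """Adds fixed or learned position embeddings to (batch, seq, dim) inputs."""
    max_len: int
    learnable: bool  # True -> learned embedding, False -> fixed sinusoidal

    @nn.compact
    def __call__(self, x):
        dim = x.shape[-1]
        if self.learnable:
            pe = self.param('pos_emb',
                            nn.initializers.normal(stddev=0.02),
                            (1, self.max_len, dim))
        else:
            pe = sinusoidal_pos_emb(self.max_len, dim)
        return x + pe[:, :x.shape[1], :]
```

The only difference between the two settings is that the fixed table is a constant baked into the forward pass, while the learned version is an extra parameter updated by the optimizer.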

Thank you very much for your help!

lucaslingle commented 8 months ago

For ListOps, I think the checkpointing is broken somehow: averaged across runs, a model restored from a checkpoint performs the same as a randomly initialized one.

However, you can get the correct result for the trained model by evaluating on the test set at the end of training, rather than saving the model to a checkpoint and reloading it.
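A quick way to test the checkpointing suspicion in isolation is a standalone save/restore round trip, sketched below assuming the `flax.training.checkpoints` API (this is not the repo's code). If this passes, the loss of trained weights more likely sits in how the train script writes or selects checkpoints than in the serialization itself:

```python
import tempfile
import jax
import jax.numpy as jnp
import flax.linen as nn
from flax.training import checkpoints

# Tiny stand-in model: init params, save them, restore them, compare.
model = nn.Dense(features=4)
params = model.init(jax.random.PRNGKey(0), jnp.ones((1, 8)))

with tempfile.TemporaryDirectory() as ckpt_dir:
    checkpoints.save_checkpoint(ckpt_dir, params, step=0)
    restored = checkpoints.restore_checkpoint(ckpt_dir, target=params)

# All entries should be False if the round trip is faithful.
mismatches = jax.tree_util.tree_map(
    lambda a, b: bool(jnp.any(a != b)), params, restored)
print(mismatches)
```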

lucaslingle commented 8 months ago

@LeoXinhaoLee

By the way, did you change any other settings for Pathfinder32? I tried learnable position embeddings as you described, but my vanilla Transformer is still stuck at 50% accuracy, i.e., chance on this binary task.

Thanks for any insights you can provide!
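One diagnostic that might help narrow it down: Pathfinder cannot be solved by a model that ignores token order, so it is worth checking whether predictions move at all when the pixel sequence is shuffled. A rough sketch (`apply_fn` and its signature are hypothetical; adapt to the actual model):

```python
import jax
import jax.numpy as jnp

def position_sensitivity(apply_fn, params, batch, key):
    """Mean absolute change in logits when the input sequence is shuffled.

    A near-zero value suggests positional information never reaches the
    model (e.g., the position embedding is ignored or never trained),
    which would explain accuracy pinned at chance.
    """
    perm = jax.random.permutation(key, batch.shape[1])
    shuffled = batch[:, perm]  # shuffle along the sequence axis
    gap = jnp.abs(apply_fn(params, batch) - apply_fn(params, shuffled))
    return gap.mean()
```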