Text Classification Configuration: Paper vs Code

google-research / long-range-arena

Long Range Arena for Benchmarking Efficient Transformers

Apache License 2.0


Text Classification Configuration: Paper vs Code #21

Status: Closed (adamsolomou closed this issue 3 years ago)

adamsolomou commented 3 years ago

Hi, for the byte-level document classification task there seems to be a discrepancy between the paper (see Appendix 1.2) and the config file in the repository.

Paper

6 layers, 8 heads, 512 hidden dimensions, d=2048 for positional FFN

Code

config.emb_dim = 256
config.num_heads = 4
config.num_layers = 4
config.qkv_dim = 256
config.mlp_dim = 1024
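
To make the mismatch concrete, the two settings can be compared field by field. This is only a minimal sketch using plain dictionaries rather than the repository's config objects. The helper `config_diff` is hypothetical, and mapping the paper's "512 hidden dimensions" onto `emb_dim` is an assumption, since the paper does not name the config fields.

```python
# Hyperparameters as stated in the paper (Appendix 1.2), keyed by the
# repository's config field names. ASSUMPTION: "512 hidden dimensions"
# corresponds to emb_dim; the paper does not name the config fields.
paper = {"num_layers": 6, "num_heads": 8, "emb_dim": 512, "mlp_dim": 2048}

# Values from the repository config quoted above.
code = {
    "emb_dim": 256,
    "num_heads": 4,
    "num_layers": 4,
    "qkv_dim": 256,
    "mlp_dim": 1024,
}


def config_diff(expected, actual):
    """Return {field: (expected, actual)} for shared fields whose values differ.

    Hypothetical helper for illustration; not part of the repository.
    """
    return {
        key: (expected[key], actual[key])
        for key in expected
        if key in actual and expected[key] != actual[key]
    }


diff = config_diff(paper, code)
for field, (paper_value, code_value) in sorted(diff.items()):
    print(f"{field}: paper={paper_value}, code={code_value}")
```

Under this mapping, every field that the paper specifies differs from the config; `qkv_dim` is left out of the comparison because the paper text quoted above does not state it separately.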

Could you please resolve this?

The same discrepancy appears in other tasks, e.g. image classification.

MostafaDehghani commented 3 years ago

Configs for reproducing the results are now available at: https://github.com/google-research/long-range-arena/tree/main/lra_benchmarks/text_classification/configs

Feel free to reopen the issue if there are any further questions.