google-research / long-range-arena

Long Range Arena for Benchmarking Efficient Transformers
Apache License 2.0
711 stars 77 forks

How to run test #3

Closed da03 closed 3 years ago

da03 commented 3 years ago

This might be a dumb question, but it seems that there's only a train.py, which trains and prints validation stats. How do I test the model to get numbers comparable to the accuracy numbers in the table?

vanzytay commented 3 years ago

Some tasks (like imdb) do not have validation sets, so the results on "val" are test results. On other tasks, the test sets are usually also produced as a TensorFlow dataset object, so you can simply replace "val" with "test" in the main training script, or copy the eval loop once more to evaluate on test.
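A minimal sketch of the "copy the eval loop" option, not the repository's actual code: it assumes each split is an iterable of batches shaped like dicts with "inputs" and "targets" keys, and that the model is wrapped in a function returning one predicted label per example. The names `evaluate`, `splits`, and `predict_parity` are hypothetical stand-ins for the real datasets and model.

```python
from typing import Callable, Dict, Iterable, Sequence


def evaluate(predict_fn: Callable[[Sequence], Sequence[int]],
             batches: Iterable[Dict[str, Sequence]]) -> float:
    """Return the accuracy of predict_fn over every batch of one split."""
    correct = 0
    total = 0
    for batch in batches:
        preds = predict_fn(batch["inputs"])
        for pred, target in zip(preds, batch["targets"]):
            correct += int(pred == target)
            total += 1
    if total == 0:
        raise ValueError("split is empty")
    return correct / total


# Hypothetical toy data standing in for the real "val" and "test" datasets.
splits = {
    "val": [{"inputs": [1, 2, 3, 4], "targets": [1, 0, 1, 0]}],
    "test": [{"inputs": [5, 6], "targets": [1, 1]},
             {"inputs": [7, 8], "targets": [1, 0]}],
}


def predict_parity(inputs):
    """Hypothetical stand-in for the trained model: predicts x mod 2."""
    return [x % 2 for x in inputs]


# The same loop runs on both splits; only the dataset passed in changes.
val_acc = evaluate(predict_parity, splits["val"])
test_acc = evaluate(predict_parity, splits["test"])
print(f"val accuracy: {val_acc:.2f}, test accuracy: {test_acc:.2f}")
```

Keeping the loop in one function and passing the split in makes it harder for the validation and test paths to drift apart, for example by applying different preprocessing or batching to each.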

cifkao commented 3 years ago

Should we expect evaluation scripts/notebooks to be added soon?

cifkao commented 3 years ago

See #8 for a fix.

BalloutAI commented 2 years ago

@cifkao, test_only is giving very low accuracy on listops, while the validation accuracy during training is much higher. Any idea why?