Closed by haileyschoelkopf 1 year ago
I remember testing T0 on BIG-bench with the Eval Harness last year. This could probably save us time and effort.
Oh, that would be awesome if so. Was this in mesh-tensorflow / t5x or in Hugging Face?
https://colab.research.google.com/github/google/BIG-bench/blob/main/bigbench/bbseqio/docs/t5x_eval.ipynb This seems to be the script to use?
We want to be able to evaluate our models on BIG-bench Lite. The BIG-bench code is not the most outsider-friendly, so I'll try to add the BIG-bench Lite tasks to the eval-harness and test on GPT-2 for equivalence, to confirm that the harness scores match those of the original implementation.
So far: I have looked through the BIG-bench HF code a bit.
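For the equivalence check, something like the following might be enough: a minimal sketch that compares per-task scores from the two implementations and reports any task that differs by more than a tolerance. The function name `compare_scores`, the task names, the tolerance, and the numbers are all hypothetical placeholders; it does not call BIG-bench or the eval-harness, and it assumes each side has already been reduced to a plain `{task: score}` dict.

```python
import math


def compare_scores(reference, candidate, abs_tol=1e-3):
    """Return {task: (reference_score, candidate_score)} for every task that
    is missing from one side or whose scores differ by more than abs_tol."""
    mismatches = {}
    for task in sorted(set(reference) | set(candidate)):
        ref = reference.get(task)
        cand = candidate.get(task)
        if ref is None or cand is None:
            # Task was only evaluated by one of the two implementations.
            mismatches[task] = (ref, cand)
        elif not math.isclose(ref, cand, rel_tol=0.0, abs_tol=abs_tol):
            mismatches[task] = (ref, cand)
    return mismatches


# Placeholder numbers for illustration only, not real results.
reference_scores = {"task_a": 0.250, "task_b": 0.410}
harness_scores = {"task_a": 0.250, "task_b": 0.395}

print(compare_scores(reference_scores, harness_scores))
```

An empty dict would mean the two implementations agree on every task within the tolerance. The right tolerance probably depends on the metric and on how much nondeterminism the two code paths have, so `1e-3` is only a starting guess.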