google-research / albert

ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
Apache License 2.0

Fix error in RACE experiments caused by missing all.txt #186

Closed: twilightdema closed this 4 years ago

twilightdema commented 4 years ago

ROOT CAUSE: The original RACE dataset downloaded from https://www.cs.cmu.edu/~glai1/data/race/ does not contain an all.txt file, so running experiments on RACE produces an error.

FIX: The RACE dataset consists of many .txt files, each containing a single line that holds one JSON training record. This fix iterates over all the .txt files and reads each one, instead of reading from a single all.txt.
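
The approach described in the fix could look roughly like the sketch below. This is not the code from the PR: the function name `read_race_records` and the directory argument are hypothetical, and it assumes only what is stated above, namely that each .txt file holds one JSON record on its first line.

```python
import json
from pathlib import Path


def read_race_records(split_dir):
    """Yield one parsed JSON record per .txt file under split_dir.

    Hypothetical helper: assumes each .txt file stores a single JSON
    record on its first line, as described in the fix above.
    """
    # Search subdirectories too, and sort for a deterministic order.
    for path in sorted(Path(split_dir).rglob("*.txt")):
        with path.open(encoding="utf-8") as f:
            line = f.readline().strip()
        if line:  # skip empty files instead of failing on them
            yield json.loads(line)
```

For example, `list(read_race_records("RACE/train"))` would collect every record under that directory, where `RACE/train` is an assumed location for the extracted dataset.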