Open jowagner opened 3 years ago
Find out why we did not succeed in training a usable ELECTRA model with a 24-hour computation budget in issue #76, when the 48-hour BERT models perform OK and ELECTRA is supposed to reach good performance levels much more quickly.
Reading: