bigscience-workshop / evaluation

Code and Data for Evaluation WG

feat: Add LAMA TREX task #68

Closed JanKalo closed 2 years ago

JanKalo commented 3 years ago

Made minor adaptations to the LAMA templates and evaluation procedure so that they work with a causal LM instead of a masked LM, as discussed on Slack. The quality is much worse than with a masked LM.
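For readers unfamiliar with the idea, here is a minimal sketch of one way such an adaptation could look. It is not the code from this PR. It assumes LAMA-style templates with `[X]` (subject) and `[Y]` (object) slots, cuts each template off at `[Y]` so the object becomes the continuation a causal LM must produce, and scores exact-match precision@1. The names `to_causal_prompt`, `precision_at_1`, and the `predict_next` callable are hypothetical; `predict_next` stands in for whatever model call returns the top-1 continuation.

```python
from typing import Callable, Iterable, Tuple


def to_causal_prompt(template: str, subject: str) -> str:
    """Turn a LAMA-style cloze template into a left-to-right prompt.

    Everything from the object slot "[Y]" onward is dropped, so the
    object is what a causal LM has to generate next.
    (Assumption: this is one plausible adaptation, not the PR's exact one.)
    """
    if "[Y]" not in template:
        raise ValueError("template has no [Y] slot")
    prefix = template.split("[Y]", 1)[0]
    if "[X]" not in prefix:
        # Templates whose subject comes after the object cannot be
        # turned into a prefix prompt this way.
        raise ValueError("subject slot [X] must precede [Y]")
    return prefix.replace("[X]", subject).rstrip()


def precision_at_1(
    examples: Iterable[Tuple[str, str, str]],
    predict_next: Callable[[str], str],
) -> float:
    """Fraction of (template, subject, gold_object) triples for which the
    top-1 prediction matches the gold object (case-insensitive)."""
    hits = 0
    total = 0
    for template, subject, gold in examples:
        prediction = predict_next(to_causal_prompt(template, subject))
        hits += prediction.strip().lower() == gold.strip().lower()
        total += 1
    return hits / total if total else 0.0
```

Unlike a masked LM, a causal LM sees no right-hand context after the slot and is not restricted to single-token answers, so templates where the object is not at the end lose information under this kind of truncation.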

Model: GPT-2. Runtime: 75 min on CPU. Result: 6.98% precision.