peleiden / daluke

A Danish-speaking language model with entity-aware self-attention
MIT License

Choose learning rate during training #39

sorenmulli closed 3 years ago

sorenmulli commented 3 years ago

Test 1e-3, 2/20 1e-5, 5/20

sorenmulli commented 3 years ago

argmax likelihood = 5e-4, atm?

asgerius commented 3 years ago

6.9e-4