stickeritis / sticker

Succeeded by SyntaxDot: https://github.com/tensordot/syntaxdot

Rewrite pretrain loop to support validation every N steps #162

Closed danieldk closed 4 years ago

danieldk commented 4 years ago

The main training loop now iterates over a given number of steps rather than epochs. In each iteration of the loop:

Since specifying the overall training length in steps (--steps) is often inconvenient, the length can also be specified as a number of epochs using the --epochs option.

If the training length is specified in the number of epochs, a quick pass over the pretraining data is done to count the number of sentences.
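As a rough illustration of the epoch-to-step conversion described above, here is a minimal sketch. It is not the code from this PR: the function names, the blank-line-separated sentence format, and the idea that one step processes one batch of `batch_size` sentences are all assumptions made for the example.

```rust
use std::io::{self, BufRead};

/// Count sentences in one pass over the data.
/// Assumption: sentences are separated by blank lines (CoNLL-style).
fn count_sentences<R: BufRead>(reader: R) -> io::Result<usize> {
    let mut n_sentences = 0;
    let mut in_sentence = false;

    for line in reader.lines() {
        if line?.trim().is_empty() {
            // A blank line closes the current sentence, if there is one.
            if in_sentence {
                n_sentences += 1;
            }
            in_sentence = false;
        } else {
            in_sentence = true;
        }
    }

    // The last sentence may not be followed by a blank line.
    if in_sentence {
        n_sentences += 1;
    }

    Ok(n_sentences)
}

/// Convert a number of epochs into a number of steps.
/// Assumption: one step processes one batch of `batch_size` sentences.
fn epochs_to_steps(epochs: usize, n_sentences: usize, batch_size: usize) -> usize {
    assert!(batch_size > 0, "batch size must be positive");
    // Ceiling division: a partial final batch still costs one step.
    let steps_per_epoch = (n_sentences + batch_size - 1) / batch_size;
    epochs * steps_per_epoch
}

/// Whether validation should run after `step` (counted from 1).
/// `validate_every` is a hypothetical parameter; 0 disables validation.
fn is_validation_step(step: usize, validate_every: usize) -> bool {
    validate_every > 0 && step % validate_every == 0
}
```

With these helpers, a step-based loop would run `for step in 1..=steps`, train on one batch per iteration, and validate whenever `is_validation_step(step, validate_every)` holds. The extra pass over the data is only needed when the length is given in epochs.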

No Save type is used for now, because the interface of the savers is not convenient for the pretraining scenario.

Fixes #143


Sorry for the big diff + these training loops are always very imperative and messy :(.