We apply the (functional, deterministic) normalization grammar and the pronunciation grammar to obtain the `norm` and `raw_pron` fields of the `Verse` message.
We then apply, in order, the variable rule, the syllable rule, the weight rule, and the hexameter rule. (The general foot rule and the hexameter rule, a filter on the foot rule that requires the foot sequence to be a hexameter, have been merged to minimize bookkeeping.) Before each application we project onto the output, so that each intermediate representation is kept around as a lattice. Once we reach the end state, we check for failure (as in defective lines). If there has been no failure, we compute the shortest path and work backwards via composition. We apply shortest path at every stage even though the paths all have the same labeling; this is trivial, and is just a cheap way of obtaining a string transducer. We then "chunk" the intermediate transducer lattices to obtain the alignments and convert these into message form.
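To make the control flow concrete, here is a toy, pure-Python analogue of the cascade. It is only a sketch and not the code in this PR: sets of strings stand in for lattices, plain functions stand in for the rewrite rules, and `min` stands in for shortest path. The names `weight_rule`, `foot_rule`, `run_cascade`, and `backtrace`, along with the tiny `a`/`l`/`h` alphabet, are hypothetical and chosen for illustration. The toy foot rule also accepts any number of feet rather than exactly six.

```python
import itertools
from typing import Callable, List, Optional, Set

# Toy stand-ins (assumptions): a "lattice" is a set of candidate strings and
# a "rule" maps one string to the set of its rewrites (empty set = no path).
Rule = Callable[[str], Set[str]]


def weight_rule(s: str) -> Set[str]:
    """Hypothetical rule: 'l' is light, 'h' is heavy, 'a' may be either."""
    options = {"a": "LH", "l": "L", "h": "H"}
    if any(c not in options for c in s):
        return set()
    return {"".join(p) for p in itertools.product(*(options[c] for c in s))}


def foot_rule(s: str) -> Set[str]:
    """Hypothetical filter: parses dactyls (HLL -> D) and spondees (HH -> S)."""
    feet, i = [], 0
    while i < len(s):
        if s.startswith("HLL", i):
            feet.append("D")
            i += 3
        elif s.startswith("HH", i):
            feet.append("S")
            i += 2
        else:
            return set()  # No valid foot here, so this candidate dies.
    return {"".join(feet)} if feet else set()


def run_cascade(text: str, rules: List[Rule]) -> Optional[List[Set[str]]]:
    """Applies the rules in order, keeping every intermediate lattice."""
    lattices = [{text}]
    for rule in rules:
        # Keeping only the outputs plays the role of output projection.
        outputs: Set[str] = set()
        for s in lattices[-1]:
            outputs |= rule(s)
        lattices.append(outputs)
    # Failure is checked once, at the end state (a "defective" line).
    return lattices if lattices[-1] else None


def backtrace(lattices: List[Set[str]], rules: List[Rule]) -> List[str]:
    """Picks one final analysis, then works backwards through the stages."""
    chosen = [min(lattices[-1])]
    for lattice, rule in zip(reversed(lattices[:-1]), reversed(rules)):
        target = chosen[-1]
        chosen.append(min(s for s in lattice if target in rule(s)))
    return chosen[::-1]
```

For example, `backtrace(run_cascade("haaha", [weight_rule, foot_rule]), [weight_rule, foot_rule])` picks out the one string per stage that lies on the surviving path, while `run_cascade("hl", [weight_rule, foot_rule])` returns `None` because no foot sequence can be built.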
The resulting Verse proto looks like the following (for Aen. 1.1):
This PR closes #64.
Processing each Aeneid book takes just under 2s on my cheap laptop.
Known limitations of this PR:
scansion_test.py
)