Closed: tobymarsden closed this 2 years ago
@dimus here we go; this is updated to preserve diaereses (but not other diacritics) with the -D option. It applies to details, normalized and canonical. The exception is stemmed, which has the diaereses removed but transliterated directly (i.e. ö -> o), which produces correctly spelled names in my test corpus. Other diacritics (including e.g. an ö not preceded by a vowel) are transliterated as they are currently (oe in this example).
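To make the rule above concrete, here is a minimal Python sketch of the behaviour as described in this comment. It is not the code in this PR: the function names, the `preserve` and `for_stem` parameters, the vowel set, and the fallback table are all hypothetical, and only the ö -> oe and ö -> o mappings come from the description. It treats a vowel carrying U+0308 as a diaeresis only when it directly follows another vowel.

```python
import unicodedata

COMBINING_DIAERESIS = "\u0308"
VOWELS = set("aeiouy")  # assumed vowel set, for illustration only

# Hypothetical fallback table. Only "ö" -> "oe" is taken from the
# description above; the other entries are assumptions. Lowercase only.
FALLBACK = {"ö": "oe", "ü": "ue", "ä": "ae"}


def is_diaeresis(word: str, i: int) -> bool:
    """True if word[i] is a vowel with U+0308 that directly follows a vowel."""
    if i == 0:
        return False
    decomposed = unicodedata.normalize("NFD", word[i])
    if len(decomposed) != 2 or decomposed[1] != COMBINING_DIAERESIS:
        return False
    if decomposed[0].lower() not in VOWELS:
        return False
    prev_base = unicodedata.normalize("NFD", word[i - 1])[0].lower()
    return prev_base in VOWELS


def render(word: str, preserve: bool = False, for_stem: bool = False) -> str:
    """Sketch of the -D behaviour: keep diaereses, or strip them for stems.

    This only handles the diaeresis step; it does not do any stemming.
    """
    word = unicodedata.normalize("NFC", word)
    out = []
    for i, ch in enumerate(word):
        if preserve and is_diaeresis(word, i):
            if for_stem:
                # Stemmed form: drop the mark, keep the base letter (ö -> o).
                out.append(unicodedata.normalize("NFD", ch)[0])
            else:
                out.append(ch)
        else:
            # Everything else goes through the existing transliteration.
            out.append(FALLBACK.get(ch, ch))
    return "".join(out)


print(render("Leptochloöpsis", preserve=True))                 # Leptochloöpsis
print(render("Leptochloöpsis", preserve=True, for_stem=True))  # Leptochloopsis
print(render("Leptochloöpsis"))                                # Leptochlooepsis
print(render("Mörch", preserve=True))                          # Moerch
```

In the last call the ö follows a consonant, so it is not treated as a diaeresis and falls through to the usual oe transliteration even when preservation is switched on.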
I've applied this to the web interface too. I'm afraid the ronn command needs re-running: ronn's dependencies seem to be broken at the moment and I couldn't easily install it.
As always let me know what modifications you need and I'll hop on it. Thanks!
Looks good to me, @tobymarsden! Merging and rebuilding the man pages.
This is also for discussion, and to flesh out what an option to preserve diacritics could look like. (The API interface etc. is not implemented yet, though if this idea has legs I'd be very happy to take that on.)