jtauber / greek-accentuation

Python 3 library for accenting (and analyzing the accentuation of) Ancient Greek words
MIT License
56 stars, 10 forks

How do I get rid of extra accents from enclitics: e.g., Μαῖράν #8

Closed by gregorycrane 5 years ago

gregorycrane commented 6 years ago

I don't see a strip/simplify accent routine. Am I missing this?

jtauber commented 6 years ago

In all the work I do, I calculate normalised forms and work with those (see https://jktauber.com/2018/07/23/normalisation-column-morphgnt/ for details in the context of the GNT).

I have code for handling much of that normalisation. It probably makes sense for me to include at least some of that in greek-accentuation.

In the short term, I've put said code in a gist: https://gist.github.com/jtauber/ed07e0fd15ecdc5394755d3e0c9304f8
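As a rough illustration of the case in the issue title (this is not the code from the gist, and it does not use greek-accentuation's own API), here is a minimal Python sketch. It assumes that when a word carries two accent marks, the second one came from a following enclitic and can be dropped. The function name `drop_enclitic_accent` is hypothetical; only the standard-library `unicodedata` module is used.

```python
import unicodedata

# Combining marks that represent Greek accents after NFD decomposition.
ACUTE = "\u0301"
GRAVE = "\u0300"
CIRCUMFLEX = "\u0342"  # combining Greek perispomeni
ACCENTS = {ACUTE, GRAVE, CIRCUMFLEX}


def drop_enclitic_accent(word):
    """Hypothetical helper: if a word has more than one accent mark,
    remove the last one (assumed to be caused by a following enclitic)."""
    decomposed = unicodedata.normalize("NFD", word)
    positions = [i for i, ch in enumerate(decomposed) if ch in ACCENTS]
    if len(positions) > 1:
        last = positions[-1]
        decomposed = decomposed[:last] + decomposed[last + 1:]
    # Recompose so the result uses precomposed characters again.
    return unicodedata.normalize("NFC", decomposed)


print(drop_enclitic_accent("Μαῖράν"))  # → Μαῖραν
```

Breathings and other diacritics are left alone, and a word with a single accent is returned unchanged. Other normalisation steps, such as turning a grave into an acute, are outside this sketch.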

gregorycrane commented 6 years ago

It's not a big deal, but you have so many usefully packaged routines that I don't want to miss something you have already done.

My previous approach was exhaustive and thus emphasized recall, but normalized Greek goes a long way with the zillions of words we have to index.

jtauber commented 5 years ago

I've finally created https://github.com/jtauber/greek-normalisation to package up all the various normalisation stuff I've used over the years.
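For the plain "strip" half of the original question, a minimal sketch using only Python's standard library might look like the following. It is an illustration under stated assumptions, not the API of greek-accentuation or greek-normalisation: the name `strip_accent_marks` is hypothetical, and "accent" here means only acute, grave, and circumflex, so breathings are kept.

```python
import unicodedata

# Acute, grave, and circumflex (perispomeni) as combining marks.
ACCENT_MARKS = {"\u0301", "\u0300", "\u0342"}


def strip_accent_marks(word):
    """Hypothetical helper: remove all accent marks, keep breathings,
    diaeresis, and iota subscript."""
    decomposed = unicodedata.normalize("NFD", word)
    kept = "".join(ch for ch in decomposed if ch not in ACCENT_MARKS)
    return unicodedata.normalize("NFC", kept)


print(strip_accent_marks("Μαῖράν"))    # → Μαιραν
print(strip_accent_marks("ἄνθρωπος"))  # → ἀνθρωπος
```

Stripping every accent favours recall when indexing, at the cost of merging forms that differ only by accent.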