gregorycrane closed this issue 5 years ago
In all the work I do, I calculate normalised forms and work with those (see https://jktauber.com/2018/07/23/normalisation-column-morphgnt/ for details in the context of the GNT).
I have code for handling much of that normalisation. It probably makes sense for me to include at least some of that in greek-accentuation.
In the short term, I've put said code in a gist: https://gist.github.com/jtauber/ed07e0fd15ecdc5394755d3e0c9304f8
It's not a big deal, but you have so many usefully packaged routines that I don't want to miss something you have already done.
My previous approach was exhaustive and thus emphasized recall, but normalized Greek goes a long way with the zillions of words we have to index.
I've finally created https://github.com/jtauber/greek-normalisation to package up all the various normalisation stuff I've used over the years.
I don't see a strip/simplify accent routine. Am I missing this?
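For anyone landing here with the same question, a minimal sketch of accent stripping using only Python's standard `unicodedata` module follows. It is not the API of greek-accentuation or greek-normalisation; the function name `strip_accents`, the `keep_breathing` parameter, and the choice of which combining marks count as accents are all assumptions made for illustration.

```python
import unicodedata

# Combining marks treated as accents in this sketch (an assumption):
# grave U+0300, acute U+0301, Greek circumflex (perispomeni) U+0342.
ACCENTS = {"\u0300", "\u0301", "\u0342"}


def strip_accents(word, keep_breathing=True):
    """Hypothetical helper: remove accents from a polytonic Greek word.

    With keep_breathing=True only the three accent marks are removed, so
    breathings, diaeresis and iota subscript survive. With
    keep_breathing=False every combining mark is removed.
    """
    # Decompose precomposed characters into base letter + combining marks.
    decomposed = unicodedata.normalize("NFD", word)
    if keep_breathing:
        kept = [ch for ch in decomposed if ch not in ACCENTS]
    else:
        kept = [ch for ch in decomposed if not unicodedata.combining(ch)]
    # Recompose whatever is left into precomposed form.
    return unicodedata.normalize("NFC", "".join(kept))
```

Calling `strip_accents("ἄνθρωπος")` keeps the smooth breathing and drops the acute, while `strip_accents("ἄνθρωπος", keep_breathing=False)` reduces the word to bare letters. Note that the second mode also removes diaeresis and iota subscript, which may or may not be what an index wants.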