Open gregorycrane opened 4 years ago
In this case, the function is brittle because it assumes we have checked for an accent rather than returning "none" or the like in the following case.
display_accentuation(get_accentuation('δ’'))
Traceback (most recent call last):
  File "", line 1, in
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/greek_accentuation/accentuation.py", line 68, in display_accentuation
    return accentuation.name.lower()
AttributeError: 'NoneType' object has no attribute 'name'
I think the underlying issue here is that get_accentuation expects a normalized accentuation. It's not intended to handle graves or words with an additional oxytone because of a following enclitic.
(greek-normalisation handles that normalization step)
What crashed it was h(\ (ἣ), the standard nom fem sg of the relative pronoun, in
οὐλομένην, ἣ μυρίʼ Ἀχαιοῖς ἄλγεʼ ἔθηκε,
or am I missing something?
On 6/12/20 12:37 PM, James Tauber wrote:
I think the underlying issue here is that |get_accentuation| expects a normalized accentuation. It's not intended to handle graves or words with an additional oxytone because of a following enclitic.
(|greek-normalisation| handles that normalization step)
It's only ἣ in running text, though. The standalone form is ἥ, and the assumption the code makes (which might be debatable, but it's the assumption I made for my own work) is that the accentuation type you query for (e.g. perispomenon, paroxytone, or whatever) is a property of the isolated accented word, not of the string in running text.
In my own corpus work, I always use greek-normalisation and generate an isolated form for tokens. I talk about it a bit in this blog post: https://jktauber.com/2018/07/23/normalisation-column-morphgnt/
If you don't want the full-on greek-normalisation, you can also just copy and paste the code from https://github.com/jtauber/greek-normalisation/blob/master/greek_normalisation/utils.py, which has things like grave_to_acute and strip_last_accent_if_two as well as its own strip_accents.
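To make the isolated-form idea concrete, here is a rough sketch of the two steps using only the standard library. These are not the implementations in greek_normalisation/utils.py: the _sketch names are hypothetical, and the code assumes that, after NFD decomposition, accents appear as the combining grave (U+0300), acute (U+0301), or perispomeni (U+0342).

```python
import unicodedata

GRAVE = "\u0300"
ACUTE = "\u0301"
PERISPOMENI = "\u0342"
ACCENTS = {GRAVE, ACUTE, PERISPOMENI}


def grave_to_acute_sketch(word):
    """Replace every combining grave with an acute (hypothetical helper)."""
    decomposed = unicodedata.normalize("NFD", word)
    return unicodedata.normalize("NFC", decomposed.replace(GRAVE, ACUTE))


def strip_last_accent_if_two_sketch(word):
    """Drop the last accent when a word carries two or more (hypothetical helper)."""
    chars = list(unicodedata.normalize("NFD", word))
    positions = [i for i, ch in enumerate(chars) if ch in ACCENTS]
    if len(positions) >= 2:
        # The extra accent caused by a following enclitic is the last one.
        del chars[positions[-1]]
    return unicodedata.normalize("NFC", "".join(chars))


def isolated_form_sketch(token):
    """Approximate the isolated form of a running-text token."""
    return strip_last_accent_if_two_sketch(grave_to_acute_sketch(token))


print(isolated_form_sketch("\u1f23"))  # running-text relative pronoun with a grave
```

With this, ἣ from running text comes back as ἥ, and a word carrying a second accent from a following enclitic loses that second accent before you ask for its accentuation type.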