tlynn747 closed 3 years ago
-- Just a comment, sorry I can't offer a fix -- This issue has caused Irish-language experts a lot of trouble: the terminology committee has to spend a lot of time on constructions like this. Some can be worked out manually with considerable effort; others can only really be disambiguated by the speaker or writer. Why? In the broader sense, which POS is chosen can have implications for whether or not a séimhiú is used on the modifier where the head noun is feminine. Also: according to some (or former) Irish grammars/standards, forms that could take a séimhiú need not do so in every situation, as a means of differentiating between meanings when the modifier is a noun.
So you would be right to say that automating the cleanup will be problematic.
I'm going through these now; there are about 1000 ambiguous cases. Most are clear in context so far... I'll post any tricky cases here if necessary.
It is unclear whether NOUN or ADJ should be used in some cases. It might not be so easy to automate this cleanup...
Inconsistencies arose in the original source data (3000 POS-tagged corpus).
1207 tíortha forbartha forbartha forbartha ADJ Adj VerbForm=Part
or
1426 cúrsaí forbartha forbartha forbairt NOUN Noun Case=Gen|Gender=Fem|Number=Sing
sent 1248 Arm Slánaithe
Cosanta - another example
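For anyone wanting to list the candidates before going through them by hand, here is a minimal sketch of how forms tagged both NOUN and ADJ could be collected from a CoNLL-U file. It is only an illustration, not the script used for this cleanup: the function name `find_ambiguous_forms`, the `sent_id` values and the sample rows are assumptions, with only the form, lemma, UPOS, XPOS and feature values for "forbartha" taken from the two analyses quoted above. It finds the candidates only; deciding which tag is right still needs a human reading the context.

```python
from collections import defaultdict

# Illustrative rows only. The "forbartha" analyses come from the thread above;
# the sent_id format and every other field (set to "_") are placeholders.
SAMPLE = (
    "# sent_id = 1207\n"
    "1\ttíortha\t_\tNOUN\t_\t_\t_\t_\t_\t_\n"
    "2\tforbartha\tforbartha\tADJ\tAdj\tVerbForm=Part\t_\t_\t_\t_\n"
    "\n"
    "# sent_id = 1426\n"
    "1\tcúrsaí\t_\tNOUN\t_\t_\t_\t_\t_\t_\n"
    "2\tforbartha\tforbairt\tNOUN\tNoun\tCase=Gen|Gender=Fem|Number=Sing\t_\t_\t_\t_\n"
)


def find_ambiguous_forms(conllu_text, tags=("NOUN", "ADJ")):
    """Return {form: {upos: [sent_id, ...]}} for forms seen with every tag in `tags`."""
    seen = defaultdict(lambda: defaultdict(list))
    sent_id = None
    for line in conllu_text.splitlines():
        if line.startswith("# sent_id"):
            sent_id = line.split("=", 1)[1].strip()
            continue
        if not line or line.startswith("#"):
            continue  # blank separator or other comment line
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword-token ranges and empty nodes
        form, upos = cols[1].lower(), cols[3]
        if upos in tags:
            seen[form][upos].append(sent_id)
    # Keep only forms that occur with all of the requested tags.
    return {
        form: dict(by_tag)
        for form, by_tag in seen.items()
        if all(tag in by_tag for tag in tags)
    }


if __name__ == "__main__":
    for form, by_tag in find_ambiguous_forms(SAMPLE).items():
        print(form, by_tag)
```

Grouping by surface form is a deliberately crude choice: it over-reports, which seems preferable here since every hit gets reviewed manually anyway.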