Closed aaronfay closed 4 years ago
@aarppe is this an FST bug? I can't find any relevant citations in e.g., Jean's book.
We'll have to look into what exactly is the source for the grammatical preverb kâ-ki- (this was implemented early on, with general input from Arok, but our understanding of Plains Cree has evolved significantly since then - my best recollection is that it's a rare form/variant, but it is possible it is a misunderstanding).
The real issue here, that corresponds to the title, is that the current organization of the FST results in a somewhat unsatisfactory analysis and generation scheme for some less frequent grammatical preverbs.
Namely, the most common grammatical preverbs indicating (sort-of) tense, kî- and wî- (for both Independent and Conjunct) and ka- for Conjunct, are analysed and specified as tense features, i.e. +Prt, +Fut+Int, and +Fut+Def, presented after the lemma.
kî-nipâw nipâw+V+AI+Ind+Prt+3Sg
wî-nipâw nipâw+V+AI+Ind+Fut+Int+3Sg
ka-nipâw nipâw+V+AI+Ind+Fut+Def+3Sg
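To make the tag layout above concrete, here is a minimal sketch that splits an analysis string into its pieces and reports which of the three common preverbs the tense tags stand for. The helper names (`split_analysis`, `tense_preverb`) are made up for illustration and are not part of the FST tooling; the mapping is only the one described in this comment.

```python
# Hypothetical helpers for illustration only; not part of the crk FST tooling.
# Mapping taken from the comment above: tense tags -> the preverb they stand for.
TENSE_TAGS = {
    ("Prt",): "kî-",
    ("Fut", "Int"): "wî-",
    ("Fut", "Def"): "ka-",
}


def split_analysis(analysis):
    """Split an analysis string into (preverb tags, lemma, suffix tags)."""
    parts = analysis.split("+")
    preverbs = []
    # Preverb features such as PV/kaa come before the lemma.
    while parts and parts[0].startswith("PV/"):
        preverbs.append(parts.pop(0))
    lemma, tags = parts[0], parts[1:]
    return preverbs, lemma, tags


def tense_preverb(tags):
    """Return the preverb implied by the tense tags, or None if unmarked."""
    for tense, preverb in TENSE_TAGS.items():
        n = len(tense)
        for i in range(len(tags) - n + 1):
            if tuple(tags[i:i + n]) == tense:
                return preverb
    return None


if __name__ == "__main__":
    for analysis in (
        "nipâw+V+AI+Ind+Prt+3Sg",
        "nipâw+V+AI+Ind+Fut+Int+3Sg",
        "nipâw+V+AI+Ind+Fut+Def+3Sg",
    ):
        preverbs, lemma, tags = split_analysis(analysis)
        print(lemma, preverbs, tense_preverb(tags))
```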
The less frequent grammatical preverbs (which often denote some combination of tense, aspect and modality) are analyzed as preverb features presented before the lemma, and the stem+suffix segment is analyzed (unsatisfyingly) as a present tense form - which, of course, clashes with any tense implied by such a grammatical preverb. In such cases, and perhaps more generally as well, the present tense form should be understood as the unmarked form. This is a matter that we noted explicitly with Wolvengrey a few years back, when creating our first full paradigms for itwêwina.
One way of dealing with this would be for us to create a separate analysis/generation for the unmarked form when there are other grammatical preverbs (than kî-, wî-, ka- noted above). Another way would be to dispense with the treatment of the most common grammatical preverbs as tense features, and treat them similarly to all other grammatical preverbs as prefixal features, i.e. PV/ki+, PV/wi+, and PV/ka+.
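As a rough sketch of what the second option would mean for downstream consumers, the function below rewrites a current-style analysis (tense tags after the lemma) into the prefixal style (PV/ki+, PV/wi+, PV/ka+ before the lemma). The function name `to_prefixal` is hypothetical. Two things are assumptions rather than anything decided here: that the new preverb goes after any existing PV/ tags, and that the tense tags are simply dropped with nothing left in their place.

```python
# Hypothetical conversion sketch; the real refactor would happen in the FST
# sources and in the applications that consume the analyses.
TENSE_TO_PREVERB = [
    (("Fut", "Int"), "PV/wi"),
    (("Fut", "Def"), "PV/ka"),
    (("Prt",), "PV/ki"),
]


def to_prefixal(analysis):
    """Rewrite tense tags as a prefixal preverb feature.

    Assumptions: the new PV/ tag is placed after any existing PV/ tags,
    and the tense tags are removed without a replacement tag.
    """
    parts = analysis.split("+")
    for tense, preverb in TENSE_TO_PREVERB:
        n = len(tense)
        for i in range(len(parts) - n + 1):
            if tuple(parts[i:i + n]) == tense:
                rest = parts[:i] + parts[i + n:]
                # Find the position of the lemma (first non-PV/ element).
                k = 0
                while k < len(rest) and rest[k].startswith("PV/"):
                    k += 1
                return "+".join(rest[:k] + [preverb] + rest[k:])
    # No tense tags found: the analysis is already unmarked for tense.
    return analysis


if __name__ == "__main__":
    print(to_prefixal("nipâw+V+AI+Ind+Prt+3Sg"))
    print(to_prefixal("nipâw+V+AI+Ind+Fut+Int+3Sg"))
    print(to_prefixal("nipâw+V+AI+Ind+Fut+Def+3Sg"))
```

A mapping like this, applied at the boundary of each application, is one way the change could be staged without touching every consumer at once.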
The latter option is what has been chosen by us in the modeling of other Algonquian languages, and was based on our observation of the problematicity of the current Plains Cree model. However, what kept me/us from going for this option is that other applications rely on the FST specification being as it is, namely itwêwina and nêhiyawêtân. In the case of itwêwina that would just be a chunk of work as we control all the relevant specification elements, but nêhiyawêtân is a much more rickety application, and we have wanted to have that available for demo purposes for the time being.
I'm inclined towards us treating all (grammatical) preverbs similarly at some point, but that change will require changes in multiple places at the same time, so it has to be scheduled appropriately. Right now this can be understood as an unsatisfactory "feature" of the Plains Cree FST.
@aarppe Thank you for the detailed explanation, I fully understand the impact of technical debt in a project and appreciate how things have evolved.
With that, your detailed explanation fully answered my question 🙏! I can now generate the form I was expecting by changing the analysis from Prs to Prt as you describe:
❯ echo PV/kaa+ohkomiw+V+AI+Cnj+Prt+1Sg | hfst-optimized-lookup --silent crk-normative-generator.hfstol
PV/kaa+ohkomiw+V+AI+Cnj+Prt+1Sg kâ-kî-ohkomiyân
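For anyone wanting to script the same check, here is a minimal sketch that wraps the command above from Python. The function names are hypothetical, and it assumes the lookup tool prints one tab-separated `input<TAB>output` pair per line, which is how the output above appears; failed lookups are not filtered out.

```python
import subprocess


def parse_lookup_output(text):
    """Collect outputs per input from tab-separated lookup lines.

    Assumption: each result line is 'input<TAB>output[<TAB>...]'.
    Blank lines and lines without a tab are skipped.
    """
    results = {}
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < 2:
            continue
        results.setdefault(fields[0], []).append(fields[1])
    return results


def generate(analysis, fst_path="crk-normative-generator.hfstol"):
    """Run hfst-optimized-lookup on one analysis and return the wordforms."""
    proc = subprocess.run(
        ["hfst-optimized-lookup", "--silent", fst_path],
        input=analysis + "\n",
        capture_output=True,
        text=True,
        encoding="utf-8",
        check=True,
    )
    return parse_lookup_output(proc.stdout).get(analysis, [])


if __name__ == "__main__":
    # Requires hfst-optimized-lookup on PATH and the generator file locally.
    print(generate("PV/kaa+ohkomiw+V+AI+Cnj+Prt+1Sg"))
```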
@eddieantonio I think we can close this out for now as Antti has satisfied my question, and it looks like it relates to a bigger refactor down the road.
Just chiming in on this issue again in case it is of any use:
We'll have to look into what exactly is the source for the grammatical preverb kâ-ki- (this was implemented early on, with general input from Arok, but our understanding of Plains Cree has evolved significantly since then - my best recollection is that it's a rare form/variant, but it is possible it is a misunderstanding).
I had the opportunity today to meet with Dr. Wolvengrey and I raised this conversation as a question. Dr. Wolvengrey commented that he was not aware of a kâ-ki- verb form, or at least that there were no attested examples of such.
Reopening for discussion.
I've looked into this a bit further. In the Ahenakew-Wolfart corpus we have found no instance of kâ-ki-. I've also found no record of kâ-ki- as a distinct preverb, so it should not be included as a construction. I believe that this is a typo and was, in fact, referring to ka-kî- (abilative/"can").
Given that we have ka-kî- implemented, I suggest we remove this item from our FST.
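A check like the corpus search described above could be sketched as follows. This is not the script that was actually used; the function name and the token list in the usage example are illustrative only. The one non-obvious step is normalizing to NFC first, since a decomposed î (i plus a combining circumflex) would otherwise be miscounted as a plain i and kâ-kî- would look like kâ-ki-.

```python
import unicodedata
from collections import Counter

# The three preverb sequences discussed in this thread, normalized to NFC.
PREVERB_SEQUENCES = tuple(
    unicodedata.normalize("NFC", s) for s in ("kâ-ki-", "kâ-kî-", "ka-kî-")
)


def count_preverb_sequences(tokens):
    """Count tokens that begin with each preverb sequence."""
    counts = Counter({seq: 0 for seq in PREVERB_SEQUENCES})
    for token in tokens:
        # Normalize so precomposed and decomposed circumflexes compare equal.
        word = unicodedata.normalize("NFC", token).lower()
        for seq in PREVERB_SEQUENCES:
            if word.startswith(seq):
                counts[seq] += 1
    return counts


if __name__ == "__main__":
    # Illustrative tokens taken from forms quoted in this thread, not corpus data.
    tokens = ["kâ-kî-ohkomiyân", "kî-nipâw", "ka-nipâw"]
    for seq, n in count_preverb_sequences(tokens).items():
        print(seq, n)
```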
Yes, the inclusion of kâ-ki- (incorrect) as a distinct preverb, similar to ka-kî- (correct), was likely the result of us thinking that it was a legitimate possible variant, rather than it actually coming from any sources. So, we should remove kâ-ki-, as it is creating lots of ambiguity in addition to being incorrect.
addressed in crk/src/morphology/incoming/affixes/verbs_affixes.lexc.
🙌
I'm going to carry over our conversation from #6 and open this up as a bug:
I believe the FST analysis is incorrect for the form kâ-ki-:
~The analysis marks this as Prs (past) but the implementation is -ki- when the past marker in Plains Cree is -kî-.~ My mistake, Prt is the 'past' analysis marker in the FSTs. With that however, I am still concerned this analysis is incorrect:
I've double-checked several references to be certain, I cannot find kâ-ki- in any of my references however there are several examples of kâ-kî- in both Freda Ahenakêw's works as well as Arok Wolvengrey's thesis, for example:
There are 16 examples in total that I could find just in that paper alone.
Please let me know if you need references to further examples.