Closed: balmas closed this issue 6 years ago
Thanks! I just checked in at the local bar with free Wi-Fi. I am reviewing the amo conjugation and will be ready to report on it soon.
On Mon, Aug 13, 2018 at 7:57 AM, Bridget Almas notifications@github.com wrote:
Per report from @monzug https://github.com/monzug
Sometimes the parser is not accurate in the way it represents the suffix of a form, preventing matches into the inflection tables.
For example, for monitu, as a verbal supine, Whitaker reports the suffix as 'u' but the ending in the inflection tables is 'itu'.
Should we highlight the cell if all of the morphology matches? Should we do an extra check on the form and the endings in the cell to see if they match even if the suffix doesn't?
To discuss with @abrasax https://github.com/abrasax
More examples from #80
miserri - mus
so we do not have any matches for this adjective (Whitaker reports the ending as "mus"; our table has "us").
same for "pulcherrimus" ( pulcherri - mus )
pulcherri.mus ADJ 1 2 NOM S M SUPER
pulcher, pulchra -um, pulchrior -or -us, pulcherrimus -a -um ADJ [XXXAX]
pretty; beautiful; handsome; noble, illustrious;
opti.mus ADJ 1 1 NOM S M SUPER
bonus, bona -um, melior -or -us, optimus -a -um ADJ [XXXAO]
good, honest, brave, noble, kind, pleasant, right, useful; valid; healthy;
Ok, per discussion with @abrasax we want to augment the matching in the inflection tables as follows:
If all of the morphological features match AND the suffix in the table cell matches the ending of the form (regardless of whether it matches the suffix reported by the parser), this should be an exact match.
(I hope you had a good vacation @kirlat ... and that this request doesn't make you wish you were still on it...)
Hi @balmas, can we use "monitu" as an example to analyze new matching requirements? This word has exactly one supine inflection. This inflection is suffix-based, so we'll use supine suffix data for matching.
From morphological analyzer we have the following data for this inflection:
On the other hand, in a suffix data file we have the following suffixes (endings): "ātum", "itum", "tum", "ītum", "ātū", "itū", "tū".
So how should we match suffix against the ending of a form?
Hi @kirlat. There are never easy answers :-)
Generally, given that we are learning how inaccurate the suffix identification from the parsers is, I was thinking we would have to do something like the following pseudocode:
let I = < Inflection we are trying to match >
let MC = < the cell from the inflection table data whose features match I.features >
let exactMatch = false
for (cellSuffix of MC.suffixes) {
  if (normalized(cellSuffix) === normalized(I.suffix)) {
    exactMatch = true
  }
}
if (! exactMatch) {
  for (cellSuffix of MC.suffixes) {
    if (normalized(I.suffix).endsWith(normalized(cellSuffix))) {
      exactMatch = true
    }
  }
}
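The second pass in the pseudocode above still compares against the suffix reported by the parser, so it would not catch a case like monitu, where the cell suffix is longer than the parser's suffix. A minimal sketch of a variant whose fallback checks the ending of the whole form, as requested earlier in this thread, might look like the following. The `form`, `suffix`, and `suffixes` property names and the `normalized` helper are assumptions made for illustration, not the library's actual API.

```javascript
// Hypothetical helper: lower-casing only; accent handling is a separate concern (#60).
function normalized (str) {
  return str.toLowerCase()
}

// Hypothetical shapes: inflection = { form, suffix }, cell = { suffixes: [...] }.
// The cell is assumed to already match the inflection's morphological features.
function isExactMatch (inflection, cell) {
  const suffixes = cell.suffixes.map(normalized)
  // Pass 1: the parser's suffix equals one of the cell's suffixes.
  if (suffixes.includes(normalized(inflection.suffix))) {
    return true
  }
  // Pass 2: the full form ends with one of the cell's (non-empty) suffixes.
  const form = normalized(inflection.form)
  return suffixes.some(s => s.length > 0 && form.endsWith(s))
}

console.log(isExactMatch({ form: 'miserrimus', suffix: 'mus' }, { suffixes: ['us'] })) // true
console.log(isExactMatch({ form: 'monitu', suffix: 'u' }, { suffixes: ['itum'] })) // false
```

Skipping empty suffixes in the second pass is a deliberate choice in this sketch: every form ends with the empty string, so an empty cell suffix would otherwise match everything.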
The example of monitu has a couple of extra gotchas though:
1) the ending we want to match is "itū" which is accented, and the form itself might not be. This is the subject of a separate issue, #60, that I think we need to address by normalizing out accents, if we aren't already
2) the Supine table has active and passive voice, where passive has no endings specified, and the parser output doesn't give us voice at all. This raises questions for whether "monitu" should be matching in both Active and Passive columns. I am wondering if the table is correct in this case, and would like to ask @monzug her thoughts: should the Supine table have voice in it?
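For the first gotcha, one possible way to normalize out accents is to decompose the string and drop the combining marks. This is only a sketch under the assumption that macrons and similar marks can safely be ignored when comparing Latin endings. The `stripDiacritics` name is hypothetical, and this approach would also strip Greek accents and breathings, which may be too aggressive there.

```javascript
// Strip combining marks (macrons, breves, accents) after canonical decomposition.
function stripDiacritics (str) {
  return str.normalize('NFD').replace(/[\u0300-\u036f]/g, '')
}

console.log(stripDiacritics('itū')) // "itu"
console.log('monitu'.endsWith(stripDiacritics('itū'))) // true
```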
The supine is a verbal noun that is neither declined nor conjugated. It has only two forms: one in -um, which has active meaning, and one in -u, which has passive meaning; they are respectively the accusative and the ablative of a fourth-declension noun. It was deliberately called supine because it is indifferent to tense and conjugation, like someone lying supine without caring about anything. Interesting, right?
That said, we have two endings:
accusative in -um, the active form, used with verbs of motion such as eo, venio, mitto, etc.
ablative in -u, the passive form, used with a few adjectives and some nouns.
The most common passive supines are: auditu = to hear, cognitu = to know, dictu = to say, factu = to do, existimatu = to esteem, inventu = to find, memoratu = to tell, visu = to see.
Bridget, if we match on the ending -itu, we are going to miss a few of the most common passive supines in the ablative.
@monzug I think we need to go through these issues together. I'm getting very confused :-) I'd like to suggest we do this at our Monday morning check-in next week, so that @kirlat can also be present to ask questions and hear answers. Are you able to join us for that? it's at 8am EDT so 2pm CEST. If not, maybe we could pick a different day/time.
in the case of Supine, we have the wrong inflection table. for once, whitaker's output is correct.
the table should be this simple:
SUPINE
accusative -um
ablative -ū
no stem, no conjugation, no singular or plural, no active or passive. this is the supine.
monit-u would have a match with the above supine table (and so the other supine verbs).
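As a rough illustration of the simplified table, the supine data could be reduced to two endings keyed by case. The data shape and the `findSupineCases` function below are hypothetical and are not the format of the library's suffix data files.

```javascript
// Hypothetical, simplified supine table: one ending per case,
// with no voice, number, or conjugation.
const supineEndings = [
  { grmCase: 'accusative', ending: 'um' },
  { grmCase: 'ablative', ending: 'u' } // "ū" in the table, written here without the macron
]

// Return the cases whose ending equals the suffix reported by the parser.
function findSupineCases (suffix) {
  return supineEndings
    .filter(row => row.ending === suffix)
    .map(row => row.grmCase)
}

console.log(findSupineCases('u')) // [ 'ablative' ]
console.log(findSupineCases('itu')) // []
```

With a table like this, the suffix 'u' that Whitaker reports for monit-u finds the ablative row directly, while the older 'itu' ending finds nothing.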
I went back to the original issue that I reported. At that time, I took for granted that the inflection table was correct and that Whitaker was wrong. This bug should be split in two:
1) Update the SUPINE inflection table.
2) My suggestion in this bug is still valid for other situations (can we show a match in blue based on the morphology? can we just show the ending in the table for that specific form, so that we match the verbal form rather than the ending?). I will provide examples in which the inflection tables are correct but we do not have a match because Whitaker's output is different.
Sorry for the confusion. Monit-u is a wrong example for this desired behavior, but a good one for fixing the SUPINE table.
I have a couple of examples, not the best ones... 1) amav-eram ---> the inflection table is wrong and Whitaker is correct: the ending should be eram, but in the table we have averam. We have two options here: correct this table (we need to review all inflection tables; I do not know how long it will take, but it should be done at some point) or show the match in blue based on morphology.
2) adjective melior ----> meli-oris will never have a match with our future comparative table as the correct ending is "is" and not "oris"
ok, for amav-eram, I think we decided to fix the inflection table (#84). For comparative tables we also have #68. I have entered #82 for the supine.
@monzug I don't understand this request: "can we just show the ending in the table for that specific form? so, we do not match the ending but the verbal form?" Can you provide an example?
let's do miserri - mus as suggested above - forget for a sec that we will fix the superlative. whitaker says ending in -mus, so we don't have a match in our table as ending is us. we can highlight the adjective 1st declension singular nominative (m), which is "-us".
Ok. I'm going to split this item into 3 issues:
This one will be for the request as identified in the title: We want to highlight cells which have matching morphology. For purposes of the inflection table library, which is where this issue resides, it means we need to add a class to the cells whose features match those of the features of the inflection, so that presentation code can do something with it.
I will enter a separate issue in this repository for adding to the suffix-matching logic and a new issue in the components repository to highlight morphology cell matches.
Test case for this from #131: Greek πρόσφυμα (the morpheus parser says the suffix is μα; the table has α).
This has been implemented in https://github.com/alpheios-project/webextension/archive/comp-i169.zip.
Based on a predefined list of "morphology" features for each language, it finds those features that are present in the inflection and matches them against a morpheme. If a cell has at least one morpheme for which all morphology features match, the cell gets decorated with a bluish border.
Here are the functions that define the "morphology" features for each language. Right now they are the same as the "optional matches", but that might change if necessary.
// Latin
static getMorphologyMatchList (inflection) {
  const featureOptions = [
    Feature.types.grmCase,
    Feature.types.declension,
    Feature.types.gender,
    Feature.types.number,
    Feature.types.voice,
    Feature.types.mood,
    Feature.types.tense,
    Feature.types.person,
    Feature.types.conjugation
  ]
  if (inflection.constraints.irregular) {
    return [
      Feature.types.mood,
      Feature.types.tense,
      Feature.types.number,
      Feature.types.person,
      Feature.types.voice,
      Feature.types.conjugation
    ]
  } else {
    return featureOptions.filter(f => inflection[f])
  }
}
// Greek
static getMorphologyMatchList (inflection) {
  let featureOptions = []
  if ([Constants.POFS_PRONOUN, Constants.POFS_NUMERAL, Constants.POFS_ARTICLE].includes(inflection[Feature.types.part].value)) {
    featureOptions = [
      Feature.types.grmCase,
      Feature.types.gender,
      Feature.types.number
    ]
  } else if (inflection.hasFeatureValue(Feature.types.part, Constants.POFS_ADJECTIVE)) {
    featureOptions = [
      Feature.types.grmCase,
      Feature.types.gender,
      Feature.types.number,
      Feature.types.declension
    ]
  } else {
    featureOptions = [
      Feature.types.grmCase,
      Feature.types.declension,
      Feature.types.gender,
      Feature.types.number,
      Feature.types.voice,
      Feature.types.mood,
      Feature.types.tense,
      Feature.types.person
    ]
  }
  return featureOptions.filter(f => inflection[f])
}
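To illustrate how a feature list like the ones above might drive the border decision, here is a minimal sketch that uses plain objects in place of the library's `Feature` and inflection classes. The object shapes, the feature values, and the `getMatchList` and `hasMorphologyMatch` names are assumptions for illustration only.

```javascript
// Hypothetical plain-object stand-ins for an inflection and a table cell.
const inflection = {
  grmCase: 'nominative',
  gender: 'masculine',
  number: 'singular',
  declension: '1st',
  suffix: 'mus' // suffix as reported by the parser
}
const cell = {
  morphemes: [
    { value: 'us', grmCase: 'nominative', gender: 'masculine', number: 'singular', declension: '1st' }
  ]
}

// Keep only the candidate features that the inflection actually carries.
function getMatchList (inflection, candidates) {
  return candidates.filter(f => inflection[f] !== undefined)
}

// A cell qualifies when at least one of its morphemes agrees with the inflection
// on every feature in the match list; the suffix itself is not compared.
function hasMorphologyMatch (inflection, cell, candidates) {
  const matchList = getMatchList(inflection, candidates)
  return cell.morphemes.some(m => matchList.every(f => m[f] === inflection[f]))
}

const candidates = ['grmCase', 'declension', 'gender', 'number', 'voice', 'mood', 'tense', 'person']
console.log(hasMorphologyMatch(inflection, cell, candidates)) // true
```

Note that in this sketch an inflection carrying none of the candidate features would match every cell, so a real implementation would probably want to guard against an empty match list.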
Please let me know what you think. Thanks!
I'm not sure if it is necessary for the features to be different than for optionalMatches. But I guess it can't hurt to have this flexibility.
@monzug your feedback would be helpful on this: should we always include the border when the morphology matches, or only when we weren't able to find an exact match on the suffix?
I'm afraid when we start to fine tune this they might very well become different. That's why I did not want to tie them to optional matches. Better safe than sorry, that's what my experience with inflection tables tells me :-).
Anyway, it's just a single function call in both cases. And also by looking at that morpheme match list function we can see clear rules that drive an appearance of a nice little frame around some table cells, and it does not drive anything else, so we can change it as much as we want without any side effects.
yes, I'm in favor of that. thanks for your attention to this!
cool. I have to test it to get the feeling.
On Fri, Sep 14, 2018 at 1:41 PM, Bridget Almas notifications@github.com wrote:
yes, I'm in favor of that. thanks for your attention to this!
so so happy about this fix. Thanks!!!! Tested with a few words such as miserrimus, amandi, amicio, caro, optimus, plurium, etc. E.g. amicio -> Whitaker's ending is o, the table's ending is io, and we have the cell highlighted. Beautiful.
two comments here:
1) I would NOT highlight the irregular form, if possible, in cases such as splendor or melior, but it's perfect for plurium, so I am not sure if this is possible.
2) there is a complication for melior that I hadn't thought of before. melioris is COMP, so it's more like an adjective of the 3rd declension, but in the morphology it is part of bonus, bona, bonum, which is an adjective of the first declension. We highlight all these first-declension cells, but that is misleading as it should be 3rd declension. This should be solved by adding the COMP table (see also bug #89 Comparative adjective table and Whitaker).
Passing to Bridget, but I would say it's a great feature. Worth mentioning in the release notes?
ah, checking the Greek. Why don't we have the cell highlighted for ἱκέσθαι? Also, for εἶναι we do not have a match in the table, but we have two fully highlighted tables that have nothing to do with the morphology. See attachment.
Hmm. Maybe both of these, not highlighting morphology-only matches for irregular and comparatives could be handled by adding the ability to specify features in either the inflection table data or the lexical query result data which invalidate a match. Right now we specify the features which MUST be present to match, this would be adding features which CANNOT be present. @kirlat what do you think? Is the question clear?
this was an early attempt to make sense of the linked tables for greek verb paradigms, which have that "see declension" link. A few of the tables have multiple of those, and we were trying to distinguish which was which for the user. I'd be fine with dropping the highlighting if it's confusing though.
entered #152 for the matching problem on ἱκέσθαι and εἶναι
thinking on this more, and the meaning of the highlight, I think I'd like to decide to drop the highlighting on the "see declension" linked tables.
So would it, with highlighting disabled, look approximately like below? If the above look is correct, I will include this change into a PR then.
I think so
On Mon, Sep 17, 2018 at 11:08 AM, kirlat notifications@github.com wrote:
So would it, with highlighting disabled, look approximately like below? [image: image] https://user-images.githubusercontent.com/18631055/45631630-5d76ad00-baac-11e8-956f-b689c5a8ea90.png If the above look is correct, I will include this change into a PR then.
I think that would be fine too
@balmas, I think the implementation should depend on whether we can define those rules as a set of logical conditions such as "if word is irregular or if it has featureOneValue === A and featureTwoValue === B then skip full morphology match analysis" or whether it is a case by case situation like "skip analysis for this ending of this form of this Latin POFS".
My understanding is it's more of the former (please correct me if I'm wrong here). Then I think it's better to have a function that will define those rules so we could check each time whether we need to do a full morphology match analysis or not. This logic probably belongs to Latin or Greek datasets of an inflection tables library where we concentrate all language specific rules. So then we can do something like:
if (langDataset.shouldDoMorphologyMatch(inflection)) {
  morphologyMatch = doMorphologyMatch(inflection)
}
What do you think?
Yes, that's fine, although note that I think there are 2 sides to this -- sometimes it's the inflection table data that might define whether or not to do a morphology match, and sometimes it's the inflection that we are trying to match against.
The first scenario @monzug mentioned is to exclude the match in irregular columns, that is a property of the inflection table data.
The second scenario, is to exclude the match if the form we are matching against has Feature.type.comparative. So the function would need 2 arguments.
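A two-argument version of the rule function discussed above might be sketched as follows. The `irregular` and `comparison` property names and the plain-object shapes are assumptions for illustration; the library's real inflection and table data objects may expose this information differently.

```javascript
// Hypothetical rule: decide whether a morphology-only match should be attempted.
// `inflection` is the parsed form; `cell` describes the table data being matched.
function shouldDoMorphologyMatch (inflection, cell) {
  // Table-side rule: skip cells that belong to an irregular column.
  if (cell.irregular) {
    return false
  }
  // Inflection-side rule: skip forms that the parser marks as comparative.
  if (inflection.comparison === 'comparative') {
    return false
  }
  return true
}

console.log(shouldDoMorphologyMatch({ form: 'miserrimus' }, { irregular: false })) // true
console.log(shouldDoMorphologyMatch({ form: 'melioris', comparison: 'comparative' }, { irregular: false })) // false
console.log(shouldDoMorphologyMatch({ form: 'miserrimus' }, { irregular: true })) // false
```

Keeping both checks in one function means the caller does not need to know which side a given exclusion rule comes from.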
this fix is now merged in https://github.com/alpheios-project/webextension/tree/qa-2.0.3-7 and I've moved the remaining questions to a new issue #162
I think the last merge was for the highlighted cell and this looks good. everything else has been moved to issues #162 and #153