unicode-org / inflection

code, data and documentation related to handling inflection problems

What is our API surface? #3

Open nciric opened 4 months ago

nciric commented 4 months ago

We started discussing use cases and a potential lexicon format in #1. Let's move the details of API design to a separate issue (this one).

From my end I see the following use cases:

  1. Inflecting a single word in a message format, from its base form plus provided grammatical information, e.g. icu.inflect("sr-Latn", "Beograd", options { "vocative", "singular" }) -> "Beograde". The grammatical information needed for "Beograd", such as gender and animacy, would be pulled from a lexicon.
  2. For a given word, find its lemma and grammatical info.
  3. For a word not in the lexicon, try to "guess" its inflected form, based on rules and/or similarity to other words in the lexicon.
  4. Optional - try to align multiple related words, e.g. inflect adjectives and the corresponding noun to form a grammatically sound whole ("big red apple"). In the case of English, reorder the adjectives?
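
To make use cases 1 to 3 more concrete, here is a rough Python sketch of what such a surface could look like. Everything in it is an assumption for discussion only: the names LexiconEntry, LEXICON, inflect, analyze and guess are hypothetical, the one-entry lexicon is toy data built from the "Beograd" example above, and the guessing rule (append "e" after a small set of final consonants) is a deliberately naive stand-in that ignores palatalization and other real Serbian morphology. It is not a proposal for the actual ICU API or lexicon format.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# (case, number) -> surface form; a hypothetical, minimal paradigm shape.
Paradigm = Dict[Tuple[str, str], str]


@dataclass
class LexiconEntry:
    lemma: str
    gender: str
    animate: bool
    forms: Paradigm


# Toy lexicon keyed by (locale, lemma). A real lexicon would be loaded from data.
LEXICON: Dict[Tuple[str, str], LexiconEntry] = {
    ("sr-Latn", "Beograd"): LexiconEntry(
        lemma="Beograd",
        gender="masculine",
        animate=False,
        forms={
            ("nominative", "singular"): "Beograd",
            ("vocative", "singular"): "Beograde",
        },
    ),
}


def guess(locale: str, word: str, case: str, number: str) -> Optional[str]:
    """Use case 3: naive rule-based fallback for words missing from the lexicon."""
    if locale == "sr-Latn" and (case, number) == ("vocative", "singular"):
        # Toy rule: only handles a few final consonants, no sound changes.
        if word and word[-1] in "bdlmnptv":
            return word + "e"
    return None  # No rule applies; the caller decides how to degrade.


def inflect(locale: str, word: str, case: str, number: str) -> Optional[str]:
    """Use case 1: inflect a lemma, preferring lexicon data over guessing."""
    entry = LEXICON.get((locale, word))
    if entry is not None:
        form = entry.forms.get((case, number))
        if form is not None:
            return form
    return guess(locale, word, case, number)


def analyze(locale: str, surface: str) -> List[Tuple[str, str, str]]:
    """Use case 2: return every (lemma, case, number) that yields this surface form."""
    results = []
    for (entry_locale, _), entry in LEXICON.items():
        if entry_locale != locale:
            continue
        for (case, number), form in entry.forms.items():
            if form == surface:
                results.append((entry.lemma, case, number))
    return results


print(inflect("sr-Latn", "Beograd", "vocative", "singular"))
print(analyze("sr-Latn", "Beograde"))
print(inflect("sr-Latn", "Zagreb", "vocative", "singular"))
```

One design question this raises: in the sketch, inflect returns None when neither the lexicon nor a rule produces a form, and analyze returns a list because a surface form can be ambiguous. Whether the real API should instead fall back to the uninflected word, or report that the result was guessed, seems worth deciding early.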