ProxPxD / Langcode

MIT License

Possibility to make abbreviations #1

Open ignazvolkov93 opened 6 months ago

ignazvolkov93 commented 6 months ago

Consider making it possible to define abbreviations inside a morpheme, because abbreviations are not all formed the same way even within one language, and not every word has an abbreviation.

The abbr feature can be declared with several options that determine how the abbreviation is built.

The feature cases (multiple values) indicates the grammatical cases in which the abbreviation can be inflected. If it is not set, a simple abbreviation is made.

The feature vowels (boolean) indicates whether or not the abbreviation must avoid vowels. In the first example below, vowels: false produces an abbreviation with no vowels.

The feature consonants (boolean) indicates whether or not the abbreviation must avoid consonants.

The features vowels and consonants cannot be used together because they would conflict; in that case, it is advised to use the feature skip or extract instead.

The feature skip (multiple strings or numbers) indicates which letters to drop from the morpheme. Numeric values give the positions to drop (negative numbers count from the end). String values give the letters to drop, wherever they occur.

The feature extract (multiple strings or numbers) indicates which letters of the morpheme to keep. Numeric values give the positions to keep (negative numbers count from the end). String values give the letters to keep, wherever they occur.

The features skip and extract cannot be used together because they would conflict.
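
A minimal Python sketch of how skip and extract could select letters; this is not code from the project. The helper names `_selected` and `apply_skip_or_extract` are made up for illustration. It assumes positions are 0-based, which is what the בדיקה example further down suggests (skipping 1 and 2 leaves the first, fourth and fifth letters).

```python
from typing import Iterable, Optional, Union

Selector = Union[int, str]


def _selected(morpheme: str, selectors: Iterable[Selector]) -> set:
    """Return the indices of `morpheme` matched by the selectors."""
    hits = set()
    for sel in selectors:
        if isinstance(sel, int):
            # Assumption: 0-based positions; negative numbers count from the end.
            # Out-of-range positions are silently ignored.
            if -len(morpheme) <= sel < len(morpheme):
                hits.add(sel % len(morpheme))
        else:
            # A string selects that letter wherever it occurs.
            hits.update(i for i, ch in enumerate(morpheme) if ch == sel)
    return hits


def apply_skip_or_extract(
    morpheme: str,
    skip: Optional[Iterable[Selector]] = None,
    extract: Optional[Iterable[Selector]] = None,
) -> str:
    """Drop the `skip` letters or keep only the `extract` letters."""
    if skip is not None and extract is not None:
        raise ValueError("skip and extract cannot be used together")
    if skip is not None:
        drop = _selected(morpheme, skip)
        return "".join(ch for i, ch in enumerate(morpheme) if i not in drop)
    if extract is not None:
        keep = _selected(morpheme, extract)
        return "".join(ch for i, ch in enumerate(morpheme) if i in keep)
    return morpheme


print(apply_skip_or_extract("usted", extract=[0, -1]))  # ud
print(apply_skip_or_extract("usted", skip=[1, -2]))     # utd
```

Raising an error when both options are present is one way to enforce the conflict rule; silently preferring one of them would also work but hides mistakes in the declaration.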

The feature max_syllable_chars (number) indicates how many letters to take from each syllable of the morpheme.

The feature final_dot (boolean) indicates whether a dot (.) is placed at the end of the abbreviation.

The feature punctuation_mark (string) indicates which mark to use for the abbreviation.

As a matter of fact, a custom abbreviation can be specified without writing the features, but it must be given directly in the abbr feature or under cases.


E.g.:

Let's take a look at the word spéizefúnkt, which can be used as a verb or a noun but is abbreviated only when used as a noun in the nominative case. It has 3 syllables (spéi, ze and fúnkt).

speizefunkt:
    form: spéizefunkt
    abbr:
        cases:
            - nominative
        vowels: false
        skip:
            - n
        max_syllable_chars: 3
        final_dot: true

This would return something like...

spéizefunkt:
    abbr:
        singular: spzfkt.
        plural: spzfktr.
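
To check that the options above really lead to spzfkt., here is a rough Python sketch of one possible processing order. Everything in it is an assumption rather than the project's behaviour: the function name `abbreviate`, the pre-split syllable list (the issue does not say how syllables are detected), the vowel set (a, e, i, o, u with diacritics stripped), and the order "drop vowels, drop skipped letters, then cap each syllable". The plural spzfktr. is not covered, since the rule that adds the r is not stated.

```python
import unicodedata

BASE_VOWELS = set("aeiou")  # assumption: plain Latin vowel letters only


def is_vowel(ch: str) -> bool:
    # Decompose the character so that "é" and "ú" count as "e" and "u".
    base = unicodedata.normalize("NFD", ch)[0]
    return base.lower() in BASE_VOWELS


def abbreviate(syllables, vowels=True, skip=(), max_syllable_chars=None,
               final_dot=False):
    """Build an abbreviation from pre-split syllables (letter skips only)."""
    parts = []
    for syllable in syllables:
        # Compose first so an accent never ends up as a separate character.
        letters = unicodedata.normalize("NFC", syllable)
        if not vowels:
            letters = "".join(ch for ch in letters if not is_vowel(ch))
        letters = "".join(ch for ch in letters if ch not in skip)
        if max_syllable_chars is not None:
            letters = letters[:max_syllable_chars]
        parts.append(letters)
    abbr = "".join(parts)
    return abbr + "." if final_dot else abbr


print(abbreviate(["spéi", "ze", "fúnkt"], vowels=False, skip=["n"],
                 max_syllable_chars=3, final_dot=True))  # spzfkt.
```

Note that the order matters: capping fúnkt to 3 letters before removing vowels and n would leave only f.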

Or maybe the word בדיקה, which is a feminine noun and cannot be abbreviated by syllables, only by skipping certain letters from the morpheme.

bdikah:
    form: בדיקה
    abbr:
        skip:
            - 1
            - 2
        punctuation_mark: ״

So it'll return...

בדיקה:
    abbr: ב״קה
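
A small Python sketch that reproduces this result, under two assumptions read off the example rather than stated in the proposal: positions are 0-based, and punctuation_mark is written once, at the place where the first skipped letter was. The function name `abbreviate_with_mark` is hypothetical.

```python
def abbreviate_with_mark(form: str, skip_positions, mark: str) -> str:
    """Drop the letters at `skip_positions` and put `mark` in their place."""
    # Normalise negative positions and ignore positions outside the word.
    drop = {
        p % len(form)
        for p in skip_positions
        if -len(form) <= p < len(form)
    }
    out = []
    mark_placed = False
    for i, ch in enumerate(form):
        if i in drop:
            # Assumption: the mark is inserted only once, at the first gap.
            if not mark_placed:
                out.append(mark)
                mark_placed = True
            continue
        out.append(ch)
    return "".join(out)


# Strings are handled in logical (reading) order, so right-to-left text
# needs no special treatment here.
print(abbreviate_with_mark("בדיקה", [1, 2], "״"))
```

If the mark should instead always go before the last letter, as is common for Hebrew abbreviations, the placement rule would need its own option.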

And now, the pronoun usted from Spanish.

usted:
    form: usted
    abbr:
        cases:
            nominative:
                singular: ud.
                plural: uds.

This one will return...

usted:
    abbr:
        singular: ud.
        plural: uds.
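
For the custom case, a lookup could be as simple as the following Python sketch. It assumes the abbr entry has already been loaded into a plain dict or string (for example by a YAML parser); the function name `resolve_custom_abbr` and the choice to return None when nothing custom is declared are illustrative, not part of the proposal.

```python
def resolve_custom_abbr(abbr, case=None, number="singular"):
    """Return a hand-written abbreviation, or None if none is declared."""
    # Custom abbreviation written directly in the abbr feature.
    if isinstance(abbr, str):
        return abbr
    cases = abbr.get("cases") if isinstance(abbr, dict) else None
    # When cases is a mapping, it carries custom forms per case.
    if isinstance(cases, dict):
        entry = cases.get(case)
        if isinstance(entry, str):
            return entry
        if isinstance(entry, dict):
            return entry.get(number)
    # A list under cases (as in the first example) means rule-based
    # generation, so there is nothing custom to return.
    return None


usted = {"cases": {"nominative": {"singular": "ud.", "plural": "uds."}}}
print(resolve_custom_abbr(usted, "nominative", "singular"))  # ud.
print(resolve_custom_abbr(usted, "nominative", "plural"))    # uds.
```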

ProxPxD commented 6 months ago

Good idea for a much later stage. It would be good to specify the rules for the abbreviations like

The same may apply to cases, as in Polish: dr, dr-a, dr-owi
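
One way the Polish pattern could be expressed is a per-case suffix table, sketched here in Python. The case names are my reading of the three forms (nominative, genitive, dative); the table `CASE_SUFFIXES` and the function `inflect_abbr` are hypothetical and only show the shape such a rule might take.

```python
# Assumption: dr / dr-a / dr-owi are nominative, genitive and dative.
CASE_SUFFIXES = {
    "nominative": "",
    "genitive": "a",
    "dative": "owi",
}


def inflect_abbr(stem: str, case: str, suffixes=CASE_SUFFIXES, joiner="-"):
    """Attach the case ending to an abbreviation stem."""
    suffix = suffixes[case]
    # An empty suffix means the bare abbreviation is used as is.
    return f"{stem}{joiner}{suffix}" if suffix else stem


for case in ("nominative", "genitive", "dative"):
    print(inflect_abbr("dr", case))
# dr
# dr-a
# dr-owi
```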