glossarist / concept-model

Glossarist Concept model

IEV improvements: add "masculine/feminine" grammatical gender #12

Open ronaldtse opened 4 years ago

ronaldtse commented 4 years ago

From IEC:

It is necessary to add the value "masculine/feminine" to cater for the decision taken by the Académie française, which adopted a report about the "féminisation des noms de métiers et de fonctions". The report is available here: http://www.academie-francaise.fr/sites/academie-francaise.fr/files/rapport_feminisation_noms_de_metier_et_de_fonction.pdf

EXAMPLES (from IEC)

internaute, m/f [IEV 732-09-05]

responsable de sécurité laser, m/f [IEC 60825-1:2014, 3.50]

opoudjis commented 4 years ago

Already addressed in Designation:

  +gender: DesignationGender[0..*]

In the general case, any noun with a gender may have more than one gender.
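A minimal sketch of what a multi-valued gender attribute could look like, assuming a Python rendering of the model. The class names mirror the UML fragment above, but the enum members, the short codes, and the `language` field are illustrative assumptions and not the actual Glossarist schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class DesignationGender(Enum):
    # Assumed value set; the real enumeration may differ.
    MASCULINE = "m"
    FEMININE = "f"
    NEUTER = "n"
    COMMON = "c"


@dataclass(frozen=True)
class Designation:
    term: str
    language: str
    # 0..* cardinality: an empty tuple means no gender is recorded,
    # and more than one entry means the noun has several genders.
    gender: Tuple[DesignationGender, ...] = ()


# The IEC example from this issue, modelled with two genders.
internaute = Designation(
    term="internaute",
    language="fr",
    gender=(DesignationGender.MASCULINE, DesignationGender.FEMININE),
)

# A designation with no gender information is equally valid.
ungendered = Designation(term="internaute", language="fr")
```

With `[0..1]` instead, `gender` would become a single optional value and the first example could not be expressed without a dedicated enum member.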

skalee commented 3 years ago

For the record, m/f gender is called common in our data model, see #20. And the relationship should be:

+gender: DesignationGender[0..1]

opoudjis commented 3 years ago

No, there are languages with nouns which can have multiple genders; happens a fair bit in Ancient Greek.

strogonoff commented 3 years ago

@opoudjis

Grammatical traits supported by Glossarist allow an optional noun gender of masculine, feminine, common or neuter. That seems to strike the best balance between consensus in the field and flexibility across languages, while disallowing structures that make no sense in the real world.

If there is a source about Greek having differing grammatical gender requirements, I am all ears.

Or perhaps we should leave this schema to control bodies?

ronaldtse commented 3 years ago

And @skalee will be familiar with all the grammar-related attributes in the IEV. I do not think we can come up with a single "international grammar model" because different bodies may want to have different types.

Regarding this particular issue, the exact concept is "m/f" (m + f), not "common"...

strogonoff commented 3 years ago

@ronaldtse, we have your and Nick’s claims here, versus (supposedly) the consensus captured by the Wikipedia page on grammatical gender. (Now, of course anyone can edit Wikipedia, but presumably the people who maintain that page have investigated this issue.) I believe the "m/f" shortcut is an artifact of people in charge of IEC not having the relevant linguistical background (which they are not really supposed to have), and in fact stands for common and neuter genders. There is no m/f concept as such.

strogonoff commented 3 years ago

If I am wrong in my interpretation of the source, I'd be happy to be corrected

skalee commented 3 years ago

Regarding this particular issue, the exact concept is "m/f" (m + f), not "common"...

As far as I understand after clarifications from @strogonoff in #20, linguists typically distinguish four genders: masculine, feminine, neuter, and common. This is enough to describe basic grammatical differences between nouns, adjectives, etc. in most languages. Common gender describes the situation where language evolution has made the masculine/feminine grammatical distinction disappear for some group of words. To me this looks exactly like what happened to genders in Dutch.

Still, we need to display it as m/f for Dutch terms on IEV pages, or whatever is most appropriate in the local language. And I'm pretty sure we'll have more local display rules, e.g. feminine in Polish as ż, feminine in Serbian as ж, and a plurality annotation that is always present in Serbian.
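The local display rules mentioned here could be kept apart from the stored value with a per-language label table. This is only a sketch: the language keys, the default abbreviations (in particular "c" for common), and the function name `gender_label` are assumptions made for illustration.

```python
from typing import Dict

# Assumed default abbreviations, used when a language has no override.
DEFAULT_LABELS: Dict[str, str] = {
    "masculine": "m",
    "feminine": "f",
    "neuter": "n",
    "common": "c",
}

# Per-language overrides taken from the examples in this thread.
LOCAL_LABELS: Dict[str, Dict[str, str]] = {
    "nl": {"common": "m/f"},   # Dutch terms on IEV pages
    "pl": {"feminine": "ż"},   # Polish
    "sr": {"feminine": "ж"},   # Serbian
}


def gender_label(gender: str, language: str) -> str:
    """Return the display label for a stored gender in a given language."""
    overrides = LOCAL_LABELS.get(language, {})
    return overrides.get(gender, DEFAULT_LABELS[gender])
```

Because the stored value stays "common" (or whatever the model settles on), changing a label only touches the table and never the data.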

Also, I'm pretty sure that at some point we'll have to extend our models with more grammatical contrasts or gender subclasses, for example animate vs inanimate (https://en.wikipedia.org/wiki/Grammatical_gender#Slavic_languages).

ronaldtse commented 3 years ago

@strogonoff it would be dangerous to second-guess IEC's decision:

I believe the "m/f" shortcut is an artifact of people in charge of IEC not having the relevant linguistical background (which they are not really supposed to have), and in fact stands for common and neuter genders. There is no m/f concept as such.

This decision was adopted by the IEC Central Office, proposed by IEC/TC 1 ("Terminology") and IEC's Terminology Manager, who are all French literate. I would consider them closer to linguists than most of us, except @opoudjis who has a linguistics PhD :wink:

skalee commented 3 years ago

For reference, all terms in IEV spreadsheets which have m/f or f/m gender:

select ievref, language, term, termattribute
from concepts
where termattribute like '%m/f%' or termattribute like '%f/m%';

IEVREF      LANGUAGE    TERM        TERMATTRIBUTE
----------  ----------  ----------  --------------------------
102-03-23   nl BE       norm        <van een vector> m/f
102-04-22   nl BE       abscis      m/f
102-04-30   nl BE       radiaal     m/f
102-04-47   nl BE       steradiaal  m/f
102-06-10   nl BE       orde        <van een vierkante matr
102-06-22   nl BE       norm van e  m/f
732-09-05   fr          internaute  m/f

I ran a similar query for synonyms too, but it returned an empty result set.
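For anyone who wants to reproduce the query, here is a small sketch against an in-memory SQLite database. The table and column names come from the query above, and three of the rows are copied from the result listing. The column types, the control row, and the added `order by` (for a stable order) are assumptions.

```python
import sqlite3

ROWS = [
    ("102-03-23", "nl BE", "norm", "<van een vector> m/f"),
    ("102-04-22", "nl BE", "abscis", "m/f"),
    ("732-09-05", "fr", "internaute", "m/f"),
    # Hypothetical control row, not from the IEV data; it must not match.
    ("000-00-00", "fr", "exemple", "m"),
]


def find_mixed_gender_terms(conn: sqlite3.Connection) -> list:
    """Return all rows whose term attribute contains m/f or f/m."""
    return conn.execute(
        "select ievref, language, term, termattribute from concepts "
        "where termattribute like '%m/f%' or termattribute like '%f/m%' "
        "order by ievref"
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "create table concepts "
    "(ievref text, language text, term text, termattribute text)"
)
conn.executemany("insert into concepts values (?, ?, ?, ?)", ROWS)
matches = find_mixed_gender_terms(conn)
```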

skalee commented 3 years ago

@strogonoff it would be dangerous to second-guess IEC's decision:

I believe the "m/f" shortcut is an artifact of people in charge of IEC not having the relevant linguistical background (which they are not really supposed to have), and in fact stands for common and neuter genders. There is no m/f concept as such.

This decision was adopted by the IEC Central Office, proposed by IEC/TC 1 ("Terminology") and IEC's Terminology Manager, who are all French literate. I would consider them closer to linguists than most of us, except @opoudjis who has a linguistics PhD 😉

Wiktionary in Dutch also annotates some nouns with m/f gender (actually v/m, in which v stands for feminine), for example https://nl.wiktionary.org/wiki/abscis or https://nl.wiktionary.org/wiki/radiaal.

That said, the question is not about presentation, but about our data model. We have two options: a [masculine, feminine] array or a single common item. In my understanding both mean exactly the same. Unless other gender mixes are possible (e.g. m+n) and unless gender order matters (m+f vs f+m), I support the idea of representing them as common, because it's simpler.

strogonoff commented 3 years ago

Display does not affect the schema, so we can show common as "m/f" or whatever localized version iff we confirm that it is (1) indeed the same and that (2) it should be shown as m/f (which I doubt a little).

skalee commented 3 years ago

it should be shown as m/f (which I doubt a little).

@strogonoff Many dictionaries use m/f or f/m to denote what you call common gender, including Wiktionary. I'm pretty sure this is how common gender is supposed to be displayed in IEV too. Or maybe as v/m which is the same but in Dutch.

So-called common gender describes the situation where a given word is either f or m, but in modern grammar it doesn't really matter because the two genders do not differ in declension.

opoudjis commented 3 years ago

... I don't know that I want to get into this, but: there are languages that have a notion of a common or epicene gender, and there are also languages in which a noun can belong to two random genders without a notion of a common gender. Rather than formulate a universal vocabulary, the sensible thing is to respect the language's grammatical tradition. If a language uses "common" as a term of art, use "common"; if a language does not, or else uses other combinations, like say "m+n", use that.

In the completely general case, it is possible for a language to have a noun that belongs to two genders (or noun classes: we should not be ruling out e.g. Swahili from coverage), which do not happen to be masculine and feminine. Virus in German for example can be both masculine and neuter. Apparently the neuter is slowly dying out for Virus: https://german.stackexchange.com/questions/2735/what-gender-does-a-virus-have?noredirect=1&lq=1 — but that's not the point: for a model, we need to cater to all possibilities.

So a noun can have multiple genders in the general case, even if we do deal with the most usual such case, m/f, as "common". And the glossarist model should not be restricted to the languages or the linguistic decisions of ISO: it should be as general as is reasonable.

strogonoff commented 3 years ago

Thanks for the background, that's useful information.

However, instead of creating a super generic scheme, I think it is better to allow control bodies to supply their own schemas. Genericness doesn't come without downsides.

strogonoff commented 3 years ago

Otherwise, we must really be top-notch linguists and maintain a model that follows the latest developments in the field, and it all quickly becomes unfeasible.

ronaldtse commented 3 years ago

And the glossarist model should not be restricted to the languages or the linguistic decisions of ISO: it should be as general as is reasonable.

Agree. May be useful to keep the model and the UI somewhat separate though.

I think it is better to allow control bodies to supply their own schemas.

Agree, in the cases where the control bodies want to supply that.

ronaldtse commented 3 years ago

I have an update from IEC:

Is m/f identical to something like German's neuter?

Not in my understanding. I was told “The masculine and feminine form of the role of a person are not really synonyms, but the proposed presentation allows an easy search on both forms.”

masculine and feminine terms are NOT synonyms, the presentation form just allows for easier searching for either term.

Or does m/f mean “masculine + feminine”, i.e. it is both m and f at the same time, but it is not n ?

“or”

m/f = masculine or feminine => not a "common gender"

If the French term is m/f, is “m/f” the proper way to display it?

was ask[ed] to implement “m/f” in the Electropedia [...] simply a convention.

So it is "m/f".

Reopening this to update the model. (if it's already good, we can close it).

ping @strogonoff @skalee @opoudjis

strogonoff commented 3 years ago

Common gender is not ‘so-called’. However, “m/f” is.

Here is a screenshot of an entry from a Spanish dictionary. Note that it uses “common” to denote gender:

This word can be used in a sentence as either masculine or feminine. The word does not have a fixed grammatical property of being masculine or feminine: regardless of whether the rest of the sentence expects a masculine or a feminine noun, this word will work, and the sentence will sound natural. The word is a sort of shape-shifter. That is why common gender is a separate concept.

On the other hand, epicene gender is not a grammatical property of a noun. Epicene gender has to do with what the word means, and that is captured by definition. Grammatical properties, on the other hand, define how designation “agrees” with other words in a sentence, and epicene gender has nothing to do with that.

Definition = knowing what the word refers to. Grammatical properties = how to use a word in a sentence without it sounding incorrect/unnatural in given language. Epicene gender is not the latter. A given epicene gender noun still has the property of belonging to a grammatical gender, such as masculine or feminine.

strogonoff commented 3 years ago

Sorry, GitHub mishandled image attachment.

[Image: Screenshot 2021-01-29 at 5 54 43 PM]

This and the Wikipedia article on grammatical gender concur on common gender.

Is there an example or a source that confirms “m/f” is a thing outside of legacy data we’re working with? Is there a clear definition that describes what this abbreviation stands for?

ronaldtse commented 3 years ago

Is there an example or a source that confirms “m/f” is a thing outside of legacy data we’re working with? Is there a clear definition that describes what this abbreviation stands for?

These are real examples requested by IEC:

internaute, m/f [IEV 732-09-05]

responsable de sécurité laser, m/f [IEC 60825-1:2014, 3.50]

The IEV today does not contain m/f because their software can't handle it. That's why they have requested this to be added to Glossarist -- they want to use m/f.

Anyway, the IEC clarification is clear, they consider m/f to be "m or f" and not a "common gender". How we model it internally is of course another story. I think an array holding m and f is a reasonable choice.
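As a rough illustration of the array option, the legacy term attribute strings could be parsed into an ordered list of genders. This is a sketch under assumptions: the function name `parse_genders`, the single-letter codes, and the rule that the gender token is the last whitespace-separated token are all hypothetical, not part of any IEC or Glossarist specification.

```python
import re
from typing import List

# Assumed single-letter codes found in IEV term attributes.
GENDER_CODES = {"m": "masculine", "f": "feminine", "n": "neuter"}

# Matches a trailing token such as "m", "f", "m/f" or "f/m".
_GENDER_TOKEN = re.compile(r"(?:^|\s)([mfn](?:/[mfn])*)\s*$")


def parse_genders(termattribute: str) -> List[str]:
    """Extract the genders from a term attribute, keeping their order."""
    match = _GENDER_TOKEN.search(termattribute)
    if match is None:
        return []
    return [GENDER_CODES[code] for code in match.group(1).split("/")]
```

Order is preserved, so "m/f" and "f/m" remain distinguishable in case that turns out to matter, and an attribute with no recognizable gender token yields an empty list rather than a guess.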

strogonoff commented 3 years ago

By “example” I meant an authoritative example from somewhere else more reliable. We know the data and its schema that we’re working with here are not very precise and contain mistakes. I imagine we want to avoid simply “cargo culting” it and crafting our schema to perpetuate imprecise definitions.

Imagine we have a tooltip that precisely explains to a reader what “m/f” means. What goes there? Where will we link the user for authoritative explanation? I have this bit of information for “common”, since this is featured in other dictionaries, but I don’t have this for “m/f”.

strogonoff commented 3 years ago

Here are two examples of how French dictionaries describe words that are “masculine or feminine”:

[Images: Screenshot 2021-01-29 at 6 42 52 PM, Screenshot 2021-01-29 at 6 42 51 PM]

Note the “n. m. sing. et n. f. pl.”

strogonoff commented 3 years ago

Current status:

  1. Some languages (such as Spanish and Russian) have a specifically defined common gender, used to denote nouns that can be used in a sentence equally well regardless of whether the sentence expects a masculine or a feminine noun in that place.

  2. Other languages (like French) use different ways of describing how a word behaves across genders from grammatical standpoint, and it is neither “common gender” nor a list of genders.

  3. We can continue using “common” in our model and special-case output it as “masculine or feminine” in French only, but that would not cover more complex cases like the above.

  4. Otherwise, the most feasible way that I see of representing this right now without resorting to per-language schemas is with a freeform string (which can to a degree be validated automatically with programmatically defined constraints, but is ultimately validated manually by the customer organization’s linguist).

  5. Epicene gender doesn’t seem to enter the picture here, since it pertains to the word’s meaning rather than to its grammatical usage.
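To make option 4 more concrete, here is a sketch of what "programmatically defined constraints" on a freeform gender string might look like. The specific rules (no surrounding whitespace, non-empty, a length limit, no control characters) and the names `MAX_LENGTH` and `validate_gender_note` are hypothetical. A real deployment would take its constraints from the control body.

```python
import unicodedata
from typing import List

MAX_LENGTH = 64  # Hypothetical limit, chosen only for this example.


def validate_gender_note(note: str) -> List[str]:
    """Return a list of problems found; an empty list means it passed."""
    problems = []
    if note != note.strip():
        problems.append("leading or trailing whitespace")
    if not note.strip():
        problems.append("empty value")
    if len(note) > MAX_LENGTH:
        problems.append("too long")
    # Unicode general categories starting with "C" cover control
    # and format characters, which should not appear in a label.
    if any(unicodedata.category(ch).startswith("C") for ch in note):
        problems.append("control characters")
    return problems
```

A string such as "n. m. sing. et n. f. pl." passes these checks. Whether it is linguistically correct would still be up to a human reviewer.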

ronaldtse commented 3 years ago

I have requested assistance from a French linguist. Let's see what he says.