glossarist / iev-data


How do we represent qualifiers in the data model? #94

Open skalee opened 3 years ago

skalee commented 3 years ago

From https://en.wiktionary.org/wiki/qualifier#English:

(grammar) A word or phrase, such as an adjective or adverb, that describes or characterizes another word or phrase, such as a noun or verb; a modifier; that adds or subtracts attributes to another.

Can we consider it as another part of speech (although in fact it's not)? Or maybe we need a separate field for that?

Qualifier examples:

Here is how they are displayed in Electropedia:

[Screenshot, 2021-01-12 19:38:35]

Feel free to move this issue to glossarist/concept-model if you find it more appropriate.

skalee commented 3 years ago

@ronaldtse ?

skalee commented 3 years ago

Also please note that there are concepts, for example 713-10-41, in which some translations are marked as "qualifiers" whereas others are regular parts of speech, like "adjective". Correct or anomaly?

[Screenshot, 2021-01-14 11:16:17]

ronaldtse commented 3 years ago

Sought clarification from IEC.

skalee commented 3 years ago

I think this is more about our internal data structures than about the presentation layer.

ronaldtse commented 3 years ago

The internal data structure also depends on intention, so let's wait for IEC confirmation. It might take some time since some are on holiday.

ronaldtse commented 3 years ago

From IEC:

This is yet another legacy problem.

The rules were changed to say: [image] [image]

but TC 1 has yet to address all the legacy data.

Perhaps we have to mark this as legacy somehow. @skalee thoughts?

ronaldtse commented 3 years ago

To IEC:

In this case how should we deal with this to-be-retired attribute?

Perhaps in the model we keep it as a “legacy” free-text attribute, and in the user interface we show it with some warning (so people don’t fill in this field unless absolutely necessary)?

ping @strogonoff for thoughts too.

strogonoff commented 3 years ago

@ronaldtse

Background (which you may know): “qualifier” just means it’s a construct that describes something (a thing or an action) and that can be removed from the sentence without messing up the grammar. These are some simple cases:

Qualifiers are often adjectives or adverbs, but other parts of speech could be used as qualifiers, especially if we are considering multi-word expressions (and vice-versa, sometimes an adjective does not act as a qualifier in a given sentence).

In my view, the “qualifier” flag isn’t entirely useless. If I didn’t know it’s considered legacy, I’d be open to adding “qualifier” (or “modifier”, maybe even “premodifier” and “postmodifier”) marker in our concept model.

My considerations in favor of the qualifier flag:

Considerations against:

strogonoff commented 3 years ago

If we are definitely retiring the qualifier, then we could do it the following way:

1) Where “qualifier” is given instead of part of speech, omit partToSpeech entirely (it’s optional).
2) Add “[qualifier]” as free text somewhere in the definition (at the end or at the beginning).

This way, the upcoming full-text search feature could be used to filter all legacy uses of “qualifier” for a batch update.

ronaldtse commented 3 years ago

@strogonoff IEC is going to remove all "qualifier" attributes sometime in the future, because the new rules no longer allow it.

So keeping "qualifier" as an extra attribute in the model won't be useful in the future; we don't have anyone to use it, and the maintenance burden would be on us.

  1. We will need to keep the "qualifier" for now because we need to faithfully reproduce IEC data.
  2. We cannot "add" text to the definition because these data items are already standardized.

Could we handle this in some way that also allows handling other legacy attributes?
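One way to picture such a generic "legacy" slot, as a minimal sketch only: the names below (`Designation`, `legacy_attributes`, `ui_warnings`) are hypothetical and are not part of the Glossarist model. The idea is that legacy values are kept verbatim as free text, so IEC data can be reproduced faithfully, while the user interface can attach a warning to any of them.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Designation:
    """Hypothetical designation record; not the actual Glossarist schema."""
    term: str
    language: str
    part_of_speech: Optional[str] = None  # optional, as discussed above
    # Free-text attributes kept verbatim from the source data, keyed by name.
    legacy_attributes: Dict[str, str] = field(default_factory=dict)


def ui_warnings(designation: Designation) -> List[str]:
    """Return one warning per legacy attribute, for display next to the field."""
    return [
        f"'{name}' is a legacy attribute; avoid it for new entries"
        for name in sorted(designation.legacy_attributes)
    ]


ac = Designation(term="AC", language="en",
                 legacy_attributes={"qualifier": "qualifier"})
print(ui_warnings(ac))
```

Because the slot is a plain name-to-text mapping, another to-be-retired attribute would need no schema change, only a new key.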

strogonoff commented 3 years ago

I see.

We should probably switch to a customizable schema for this; until then we have a couple of options:

Both apply to the concept as a whole, but technically the “qualifier” trait is one that should apply to the concept as opposed to a particular designation 🤔

skalee commented 3 years ago

Let's remove it now.

From data analysis, usage of this attribute is very inconsistent. For example, these concepts have "qualifier" in term name, not in term attributes:

IEV ref      language   designation                  term attributes
151-15-01    pl         AC (kwalifikator)
131-14-23    pl         rozwarcia, kwalifikator
151-15-13    pt         de entrada (qualificativo)
161-05-01    it         ISM (qualificativo)

And there are many, many more like these. I presume some other concepts have had their definitions reworked to clearly indicate that the term is a qualifier.

We cannot rely on the presence of that attribute.

I suppose the best we can do is to auto-replace the "qualifier" termattribute with "(qualifier)" in the designation or in front of the definition, and perhaps to add an entry in the comments.


FYI: a list of terms which have the "qualifier" attribute or similar. It may be incomplete if some localized term doesn't match my query or does not use Latin script.

select ievref, language, term, termattribute from concepts where termattribute like '%qualif%';

produces:

131-11-10 fr à paramètres répartis qualificatif
131-11-17 fr indépendant du temps qualificatif
131-11-19 fr non linéaire qualificatif
131-11-36 fr non dissipatif qualificatif
131-14-20 fr en court-circuit (1), qualificatif
131-14-21 en short-circuit qualifier
131-14-21 fr en court-circuit (2), qualificatif
131-14-22 fr en circuit ouvert (1), qualificatif
131-14-23 en open-circuit qualifier
131-14-23 fr en circuit ouvert (2), qualificatif
151-11-05 fr d’électricité qualificatif
151-11-05 pt de electricidade (qualificativo)
151-15-01 en AC qualifier
151-15-01 fr AC qualificatif
151-15-02 en DC qualifier
151-15-02 fr DC qualificatif
151-15-13 fr d'entrée qualificatif
151-15-14 fr de sortie qualificatif
151-15-20 fr en charge qualificatif
151-15-21 fr hors charge qualificatif
151-15-58 fr sous tension qualificatif
151-15-59 fr hors tension qualificatif
151-16-05 fr d'extérieur qualificatif
151-16-06 fr d'intérieur qualificatif
161-05-01 en ISM (qualifier)
702-06-29 en vestigial sideband (qualifying term)
713-10-41 en preselected-channel ... qualifier
826-14-15 en inherently short-circuit and earth fault proof qualifier
826-14-15 fr intrinsèquement protégé contre les court-circuits et les défauts à la terre qualificatif

strogonoff commented 3 years ago

If we can remove it, I don’t have objections. If we need to keep it for faithful data representation, then I suggest using an option from my previous comment (if it’s not a problem that “qualifier” will be applied to the whole concept and not to a particular designation).

ronaldtse commented 3 years ago

@skalee we can't "remove" this attribute now, we must retain this because it is part of the standardized content.

From data analysis, usage of this attribute is very inconsistent. For example, these concepts have "qualifier" in term name, not in term attributes

Qualifier is clearly an attribute, so we should extract that from the term/designation.
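A rough sketch of what such an extraction could look like, assuming the marker shows up at the end of the term either in parentheses or after a comma, as in the rows quoted earlier in this thread. The function name `split_qualifier` and the `MARKERS` list are hypothetical, and the list only covers spellings seen in this issue, so it is certainly incomplete.

```python
import re
from typing import Optional, Tuple

# Marker spellings seen in this thread only; real data will need more
# (including non-Latin scripts).
MARKERS = ["qualifier", "qualifying term", "qualificatif", "qualificativo",
           "kwalifikator", "calificativo"]

_ALTERNATION = "|".join(re.escape(m) for m in MARKERS)
# Matches a trailing "(marker)" or ", marker" at the end of the term.
_MARKER_RE = re.compile(
    r"\s*(?:\((?P<paren>" + _ALTERNATION + r")\)|,\s*(?P<comma>" + _ALTERNATION + r"))\s*$",
    re.IGNORECASE,
)


def split_qualifier(term: str) -> Tuple[str, Optional[str]]:
    """Split a trailing qualifier marker off a term, if one is present."""
    match = _MARKER_RE.search(term)
    if match is None:
        return term, None
    marker = match.group("paren") or match.group("comma")
    return term[:match.start()], marker


for raw in ["AC (kwalifikator)", "rozwarcia, kwalifikator", "short-circuit"]:
    print(split_qualifier(raw))
```

Terms without a recognized marker are returned unchanged, so the function can be run over every designation without first filtering them.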

And perhaps to add some entry in comments.

Maybe, or not. IEV manager says that the "qualifier" attribute (today) will transition to a sentence in "Notes" (in the future).

@strogonoff :

(if it’s not a problem that “qualifier” will be applied to the whole concept and not to particular designation)

In https://github.com/glossarist/iev-data/issues/94#issuecomment-760103051 @skalee provided an example where the "qualifier" attribute only applies to English.

@skalee are there any other instances that the "qualifier" attribute applies to a designation instead of a concept? If this is the only one I will ask IEC whether this particular one can be adjusted/fixed.

strogonoff commented 3 years ago

My reasoning suggests that, interestingly enough, being a qualifier is a property of the concept, not a particular designation 🤔

Unless we can come up with an example of a single concept that has some designations that are qualifiers and others that are not (I couldn’t think of one).

(Whereas part of speech, abbreviation marker, etc. clearly belong to individual designations of that concept.)
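A possible way to look for such counterexamples in the data, sketched with an in-memory SQLite database. It assumes the `concepts` table and column names from the query posted earlier; the rows are a small sample copied from listings elsewhere in this thread, not the full data set. The `'%qualif%'` pattern has the same blind spots as the original query (for instance, it misses "kwalifikator" and non-Latin scripts).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table concepts (ievref text, language text, term text, termattribute text)"
)
# Sample rows copied from this thread; None means no term attribute was listed.
conn.executemany(
    "insert into concepts values (?, ?, ?, ?)",
    [
        ("151-15-01", "en", "AC", "qualifier"),
        ("151-15-01", "fr", "AC", "qualificatif"),
        ("151-15-01", "fi", "AC", None),
        ("151-15-01", "pt", "CA (abreviatura)", None),
        ("151-15-02", "en", "DC", "qualifier"),
        ("151-15-02", "fr", "DC", "qualificatif"),
    ],
)

# Concepts where some, but not all, designations carry a qualifier attribute.
MIXED_CONCEPTS_SQL = """
select ievref,
       sum(coalesce(termattribute, '') like '%qualif%') as qualifier_rows,
       count(*) as total_rows
from concepts
group by ievref
having sum(coalesce(termattribute, '') like '%qualif%') between 1 and count(*) - 1
order by ievref
"""
mixed = conn.execute(MIXED_CONCEPTS_SQL).fetchall()
print(mixed)
```

A concept that appears in the result is only a candidate: the mismatch may reflect a genuine per-designation difference or simply inconsistent data entry.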


strogonoff commented 3 years ago

(And if we move the qualifier marker from designation to the concept, that would be a lossless transformation—I’m willing to be corrected here.)

strogonoff commented 3 years ago

My reasoning suggests that, interestingly enough, being a qualifier is a property of the concept, not a particular designation 🤔

I believe this logic is also supported by the document in @ronaldtse’s screenshot, which moves this aspect to the definition—which concerns the concept as a whole.

ronaldtse commented 3 years ago

I have asked IEC for confirmation on whether this attribute applies to the concept or designation.

Our original question:

Perhaps in the model we keep it as a “legacy” free-text attribute, and in the user interface we show it with some warning (so people don’t fill in this field unless absolutely necessary)?

The IEV terminology manager has this to say regarding legacy attributes:

from a general perspective, there will probably always be legacy issues since the standardization process is not immediate and so the proposal looks a good pragmatic approach from my perspective.

skalee commented 3 years ago

@ronaldtse:

@skalee we can't "remove" this attribute now, we must retain this because it is part of the standardized content. Qualifier is clearly an attribute, so we should extract that from the term/designation.

To me it looks like they could be in the middle of transition from attribute to something else. Maybe we should ask for clarification.

@strogonoff:

My reasoning suggests that, interestingly enough, being a qualifier is a property of the concept, not a particular designation 🤔

I have a similar feeling.

By definition, term attributes are relevant to designations. They want to turn them into notes, which suggests that qualifiers are actually meant to be relevant to concepts, or at least localized concepts. This suggests misuse of term attributes.

@ronaldtse:

@skalee are there any other instances that the "qualifier" attribute applies to a designation instead of a concept? If this is the only one I will ask IEC whether this particular one can be adjusted/fixed.

I'll investigate, though I'm not sure I'll find out anything — there is so much inconsistency.

ronaldtse commented 3 years ago

To me it looks like they could be in the middle of transition from attribute to something else. Maybe we should ask for clarification.

They already said that they will be doing the transition but it might take years to revise these terms.

I'll investigate, though I'm not sure I'll find out anything — there is so much inconsistency.

Thanks, please post if you find any inconsistency so that the IEV team can fix them (if editorial) or report them to the term owner for revision.

skalee commented 3 years ago

@ronaldtse This is really inconsistent… https://gist.github.com/skalee/e281031db37a1a08941f04ec1a7721af And I'm pretty sure I'll find even more when I improve the SQL query.

The "AC" term is quite interesting. It is a qualifier in English or Polish, whereas it is an abbreviation in Portuguese or Spanish:

151-15-01        ar       تيار متردد (متناوب)
151-15-01        de       Wechselstrom…                                                 in Zusammensetzungen
151-15-01        en       AC                                                            qualifier
151-15-01        es       AC, calificativo
151-15-01        fi       AC
151-15-01        fr       AC                                                            qualificatif
151-15-01        it       AC; c.a.; corrente alternata
151-15-01        ja       交流
151-15-01        ko       교류                                                            수식어
151-15-01        pl       AC (kwalifikator)
151-15-01        pt       CA (abreviatura)
151-15-01        ru       АС, обозначение
151-15-01        sr       AC                                                            квалификатор
151-15-01        sv       vs-
151-15-01        zh       交流

skalee commented 3 years ago

@ronaldtse Got a question: are these IEV spreadsheets exported from some tool, or do they work on them directly? I'm asking because some discrepancies, like "qualifier" being in either the term column or the attribs column, could come from an export tool issue.

ronaldtse commented 3 years ago

@skalee these spreadsheets are exported from Lotus Domino. They reflect the actual data.

ronaldtse commented 3 years ago

Thanks for bringing this up, I will create a new issue on 151-15-01.

ronaldtse commented 3 years ago

From IEC:

Strictly speaking the “qualifier” attribute applies at the concept level, but is inherited by the terms that are designated with a part of speech “adj” according to the revised rules. I don’t remember whether TBX allows this at the concept level, and I don’t have the time to look at present.

With time, the rule I provided the screen shots for below will be applied and thus the “qualifier” term attribute will disappear and be replaced by “adj”.

FYI, as shown in the example in the IEC Supplement, SK.3.1.3.6.2 Grammatical information, the same term can represent a noun and an adjective:

[image]

tl;dr:

@skalee @strogonoff we will need to adjust the concept model to accommodate these. Thanks!
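One reading of the IEC reply, sketched under assumptions: the qualifier flag lives on the concept, and a designation without its own part of speech inherits "adj" from it. All names here (`Concept`, `Designation`, `is_qualifier`, `effective_part_of_speech`) are hypothetical and do not reflect the current concept model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Designation:
    term: str
    language: str
    part_of_speech: Optional[str] = None  # stays optional


@dataclass
class Concept:
    ievref: str
    is_qualifier: bool = False  # hypothetical concept-level flag
    designations: List[Designation] = field(default_factory=list)


def effective_part_of_speech(concept: Concept,
                             designation: Designation) -> Optional[str]:
    """An explicit part of speech wins; otherwise a qualifier concept implies 'adj'."""
    if designation.part_of_speech is not None:
        return designation.part_of_speech
    return "adj" if concept.is_qualifier else None


ac = Concept("151-15-01", is_qualifier=True,
             designations=[Designation("AC", "en"), Designation("AC", "fr")])
print([effective_part_of_speech(ac, d) for d in ac.designations])
```

Keeping the inheritance in a function rather than writing "adj" into every designation leaves the stored data untouched, which matters while the legacy attribute still has to be reproduced faithfully.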

ronaldtse commented 3 years ago

Re: term can be both "noun" and "adj", I will seek clarification since the definitions can differ, and therefore the concept should also.

NOTE: This does not affect our current work, so this issue should proceed. Thanks.

strogonoff commented 3 years ago

Requiring qualifiers to be strictly adjectives doesn’t make sense from a linguistic point of view, since they aren’t always adjectives. A given vocabulary can impose that constraint, but I don’t think it generalizes across all vocabularies.

Furthermore, a designation can be (and often is) a multi-word expression, where each word belongs to its own part of speech.

That’s why part of speech is an optional property of a designation in Glossarist’s model. A designation is either a qualifier or not, but it may not have a single clear part of speech, especially if it is a phrase consisting of multiple words.

On 29 Jan 2021, at 11:53 AM, Ronald Tse notifications@github.com wrote:

"qualifier" means each of its terms is considered to have the part of speech "adj" according to the IEC revised rules (IEC Directives, Annex SK 3.1.3.6.2).