globalwordnet / english-wordnet

The Open English WordNet
https://en-word.net/

If Acre is a territory all other 26 states of Brazil should also be #533

Closed arademaker closed 2 years ago

arademaker commented 3 years ago

(n) Acre (a territory of western Brazil bordering on Bolivia and Peru) is an instance of (n) district, territory, territorial dominion, dominion (a region marked off for administrative or other purposes)

http://wn.mybluemix.net/synset?id=08552138-n

jmccrae commented 3 years ago

Probably, but I am also (personally) not keen on expanding the number of proper nouns in WordNet. Both Francis and I published papers at the last GWC about using other resources such as Geonames or Wikidata to cover these use cases.

See also the discussions under #506 and #167

arademaker commented 3 years ago

Indeed, we have all already written something along these lines, right? So we all agree. But I am not suggesting adding a new proper noun. Actually, the error here arises because Acre was a territory until 1962, but it is now a state of Brazil.

https://en.wikipedia.org/wiki/Acre_(state)

Acre was united in 1920. On June 15, 1962, it was elevated to the category of state and was the first to be governed by a woman, Iolanda Fleming, a teacher.

The obvious fix would be to remove the relation to territory, losing the historical fact that it WAS a territory in the past! This is an interesting issue. How to deal with such changes in WN?

Another possibility would be to remove Acre from WN. But as I have also published in the past, http://arademaker.github.io/bibliography/lrec-2016-gentilics.html, we would lose the ability to link places to their demonyms (https://en.wikipedia.org/wiki/Demonym).

@fcbond any opinion about these cases?

alexandretessarollo commented 3 years ago

Another example of this matter is Texas. For almost a decade it was an independent country and afterwards a state, but currently the English WordNet relations state that it is a member of the Confederate States and a part of the United States.

This issue, however, goes beyond relations. Definitions and even terms may be true only for a given time range. For instance, in geological time we have the example of the Chibanian Age, a geological age between the Calabrian and Upper Pleistocene Ages, ranging from 0.774 to 0.129 million years ago (MYA) according to the 2020 definition. The 2019 definition named it Middle Pleistocene and stated that it ranged from 0.773 to 0.126 MYA. Going back further, the 2008 definition named it Ionian, ranging from 0.781 to 0.126 MYA. Despite the changing names and time boundaries, this concept has always represented the geological age between the Calabrian and the Upper Pleistocene.
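As a rough illustration of the point about the Chibanian Age, the three definitions quoted above could be kept side by side as dated versions of one concept. This is only a sketch: the `DefinitionVersion` class, its field names, and the `version_as_of` helper are hypothetical and are not part of WordNet or of any existing tool; the names and boundaries are the ones given in this comment.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DefinitionVersion:
    """One dated definition of a concept (hypothetical structure)."""
    year: int          # year the definition was issued
    name: str          # name used for the concept in that definition
    start_mya: float   # older boundary, in million years ago
    end_mya: float     # younger boundary, in million years ago


# Values as quoted in the comment above, ordered by year.
VERSIONS = [
    DefinitionVersion(2008, "Ionian", 0.781, 0.126),
    DefinitionVersion(2019, "Middle Pleistocene", 0.773, 0.126),
    DefinitionVersion(2020, "Chibanian", 0.774, 0.129),
]


def version_as_of(year: int) -> Optional[DefinitionVersion]:
    """Return the latest definition issued in or before `year`, if any."""
    candidates = [v for v in VERSIONS if v.year <= year]
    return max(candidates, key=lambda v: v.year) if candidates else None
```

With this layout the concept keeps a single identity while its name and boundaries vary by version, so `version_as_of(2019)` would pick the Middle Pleistocene definition and a later year would pick the Chibanian one.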

Choosing to keep only the currently valid property has two major effects: we lose the historical information (Acre as a territory, Texas as a country or a member of the Confederate States, etc.) and commit ourselves to a major cost to keep everything up to date. On the other hand, keeping everything might induce errors in tools that rely on English WordNet (for instance, a simple Q&A tool might wrongly report that Texas is a state of the Confederate States).

vcvpaiva commented 3 years ago

I don't think either Acre or Texas is a big issue. Both are clearly states of the countries in question. Acre should not be considered a territory, as the synset for territory is about [territories under dispute]. Otherwise, all locations are territories too, at least in Romance languages where terra=place, ground.

just remove Acre from the territory synset and close the issue, as everyone knows that political boundaries change! my 2 cents

vcvpaiva commented 3 years ago

and create the issue for the geological eras, because a million years one way or the other is not "small change".

alexandretessarollo commented 3 years ago

IMHO Acre, Texas and Chibanian Age share some common ground: they all have properties that were valid only during a certain period of time and/or are only valid from a given date on. Choosing to represent only the current one bears three relevant impacts:

Any thoughts?

arademaker commented 3 years ago

@vcvpaiva I didn't understand what makes some changes small and others big/relevant. But the point here is about how to deal with changes. We can see the lexical resource as a mirror of the CURRENT reality, or we can try to incorporate temporal properties into relations or even into lexical forms. This is not a problem of WN only; all ontologies have to decide how to deal with changes too, right? Unfortunately, among all the potential contributors to this project, too few opinions are presented here.

vcvpaiva commented 3 years ago

this is a small problem (IMO) because these situations have been settled for decades. it's not about political problems going on at the moment. That would've been much worse.

so NO: Acre is not a territory, but a state. Texas is not a country, but a state. WordNet needs a correction, but a minor one in this case.

Ontologies do have to decide how they deal with changes and maybe the geological periods are a serious issue for this, but then make the issue about geological eras.

arademaker commented 3 years ago

Sorry, but I disagree. Actually, this issue is really about the particular case of Acre, with a remark that Acre is not the only case in WN whose relations may need to be updated. As we have said before, we can:

1) remove the synset about Acre (losing the possibility to express things like the derivation between the concept of Acre and the demonyms acreano or acriano, see https://en.wikipedia.org/wiki/Acre_(state)).

2) update the relations of this synset to express the fact that now Acre is a state, and it is not a territory anymore.

3) create another synset for Acre the state and keep the current synset for Acre as a territory (which doesn't exist anymore).

Maybe other solutions are possible, such as expanding the WN meta-model to contain something like https://www.wikidata.org/wiki/Help:Qualifiers, so that one can say during which period a statement is valid.
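A minimal sketch of what such a qualifier could look like, assuming a relation simply carries optional start and end dates. The `QualifiedRelation` class, the `targets_on` helper, and the synset labels are hypothetical illustrations, not the WN meta-model or the Wikidata API; the only date used is the 1962 statehood date quoted earlier in this thread, and the start of the territory period is deliberately left open.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class QualifiedRelation:
    """A relation with an optional validity period (hypothetical structure)."""
    source: str
    rel_type: str
    target: str
    valid_from: Optional[date] = None    # None: start not recorded
    valid_until: Optional[date] = None   # None: still valid

    def holds_on(self, day: date) -> bool:
        """True if the relation is valid on `day` (end date is exclusive)."""
        after_start = self.valid_from is None or self.valid_from <= day
        before_end = self.valid_until is None or day < self.valid_until
        return after_start and before_end


# Acre was elevated to statehood on June 15, 1962 (date quoted in this thread).
STATEHOOD = date(1962, 6, 15)

RELATIONS = [
    QualifiedRelation("Acre", "instance_hypernym", "territory",
                      valid_until=STATEHOOD),
    QualifiedRelation("Acre", "instance_hypernym", "state",
                      valid_from=STATEHOOD),
]


def targets_on(source: str, rel_type: str, day: date) -> List[str]:
    """List the targets of `rel_type` relations from `source` valid on `day`."""
    return [r.target for r in RELATIONS
            if r.source == source and r.rel_type == rel_type and r.holds_on(day)]
```

Under these assumptions, a tool that only wants the current state of the world would query with today's date, while the historical fact that Acre was a territory remains recoverable by querying an earlier date.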