globalwordnet / cili

The Global WordNet Association Collaborative Inter-Lingual Index

ILI status annotations #8

Open goodmami opened 3 years ago

goodmami commented 3 years ago

There is the idea that an ILI can be proposed and deprecated/superseded, but I don't see where this status would be annotated in, e.g., ili.ttl. I don't know if the existing ontologies have some relevant property (xyz:status or something) or if we need to make something up, but we also need an inventory of possible statuses. Vossen, Bond, and McCrae 2016 describes actions taken on existing ILIs (deprecate, supersede, split, and fork), but not the current status of an ILI. How about the following:

- provisional (from something proposed via ili="i" in WN-LMF, given some provisional identifier in CILI)
- active (accepted and in use; maybe this is the default, unannotated value?)
- deprecated (sometimes accompanied by a separate superseded link to something else)

I don't know if we'd need something like removed after deprecated, with the distinction that deprecated ILIs may still be in use but their continued use is discouraged, while removed ILIs are no longer recognized (maybe we clear the descriptions, but we need to keep the IDs so they don't get recycled). But I think simpler is better, in general.

fcbond commented 3 years ago

Hi,

they can be deprecated or superseded in ili.ttl:

schema:supersededBy ; owl:deprecated "true"^^xsd:boolean;

We don't actually have an example; perhaps we should deprecate "church mouse" and supersede a "never" with the other one?

goodmami commented 3 years ago

Thanks! So these vocabularies are defined in OWL and schema.org? For the latter, I saw the following:

The meta section contains terms primarily designed to support the implementation of the Schema.org vocabulary itself. It includes terms such as Class, Property, domainIncludes and supersededBy. They are not currently advocated for widespread use across the web.

So perhaps we should think about our own vocabulary for this one, or at least provide documentation explaining our use of the term, since it's not intended to be used outside of schema.org itself.

And those two terms don't cover the "provisional" status, so we still need something else.
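
To make the gap concrete, here is a rough Python sketch (not part of CILI; the `Status` enum and the `status_from_existing_terms` helper are hypothetical names, and treating an unannotated ILI as active is an assumption taken from the proposal earlier in this thread). It infers a status from owl:deprecated and schema:supersededBy alone and then checks which of the three proposed statuses can never be produced that way.

```python
from enum import Enum


class Status(Enum):
    """The three statuses proposed in this thread (hypothetical encoding)."""
    PROVISIONAL = "provisional"
    ACTIVE = "active"
    DEPRECATED = "deprecated"


def status_from_existing_terms(props):
    """Infer a status using only owl:deprecated and schema:supersededBy."""
    if props.get("owl:deprecated") is True or props.get("schema:supersededBy"):
        return Status.DEPRECATED
    # Assumption: an ILI with no annotation counts as active.
    return Status.ACTIVE


# Every combination of annotations the two existing terms allow.
reachable = {
    status_from_existing_terms(props)
    for props in (
        {},
        {"owl:deprecated": True},
        {"owl:deprecated": True, "schema:supersededBy": ["i18262"]},
    )
}

# Statuses that no combination of the two existing terms can express.
missing = set(Status) - reachable
```

Under these assumptions `missing` contains only `Status.PROVISIONAL`, which is the status that would need a new term.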

jmccrae commented 3 years ago

Yes, could be a good idea to define our own vocabulary here.

goodmami commented 3 years ago

I'm not familiar with RDF conventions, but could this be a new relation ("predicate"?) with a controlled vocabulary of values ("objects"?), e.g.:

<i48540>    a   <Instance> ;
    skos:definition "a fictional mouse created by Lewis Carroll"@en ;
    dc:source   pwn30:02451912-n ;
    ili:status  ili:deprecated .

<i18263>    a   <Concept> ;
    skos:definition "not at all; certainly not; not in any circumstances"@en ;
    dc:source   pwn30:00020997-r ;
    ili:status  ili:deprecated ;
    ili:supersededBy    <i18262> .

goodmami commented 3 years ago

Is the ili namespace prefix used for ILI IDs themselves? That is, is i48540 more explicitly ili:i48540? If so, then using the ili namespace for these statuses doesn't seem appropriate.

jmccrae commented 3 years ago

There are some oddities in the Turtle file that probably need to be fixed with regard to the namespaces. Currently the namespace maps ili to http://globalwordnet.org/ili/, which I don't think is a URL that ever works. It also uses a base namespace, so we get the mapping of <i123> to http://globalwordnet.org/ili/ili.ttl#i123

None of these URLs actually work at the moment anyway (I have to contact Piek).

I will make a PR to fix the file to the normal URL scheme.

goodmami commented 3 years ago

Thanks for the fixes in the PR!

I guess what I meant is that if each ILI ID is in the namespace as, e.g., ili:i48540, then the properties and values mentioned above (ili:status, ili:proposed, etc.) are sharing that namespace. I think a collision is unlikely, but they aren't really the same kind of thing. I'm not really sure what is standard practice in linked data or if that's a problem.

If it is a problem, then what if we had, e.g., http://globalwordnet.org/ili/ as the general namespace and http://globalwordnet.org/ili/concept/ as the one containing IDs? Maybe like this:

@prefix ili: <http://globalwordnet.org/ili/> .
@base <http://globalwordnet.org/ili/concept/> .
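
A small Python sketch of how those two declarations would keep the names apart (illustrative only; `expand`, `PREFIXES`, and `BASE` are hypothetical names, and the helper handles just the two term shapes used in this thread, not full Turtle syntax). Relative IRIs in angle brackets resolve against the base, while prefixed names are concatenated onto the prefix IRI.

```python
from urllib.parse import urljoin

# The two declarations proposed above.
PREFIXES = {"ili": "http://globalwordnet.org/ili/"}
BASE = "http://globalwordnet.org/ili/concept/"


def expand(term):
    """Expand '<relative-iri>' against BASE, or 'prefix:local' via PREFIXES."""
    if term.startswith("<") and term.endswith(">"):
        # Relative IRI reference: resolve against the base IRI.
        return urljoin(BASE, term[1:-1])
    # Prefixed name: prefix IRI followed by the local part.
    prefix, _, local = term.partition(":")
    return PREFIXES[prefix] + local


concept_iri = expand("<i48540>")
property_iri = expand("ili:status")
```

With these declarations `<i48540>` ends up under .../ili/concept/ and `ili:status` directly under .../ili/, so an ID and a vocabulary term could not collide even if they shared a local name.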

Or if it's not a problem, then I guess we need some rules and conventions for ILI identifiers. E.g., status, proposed, active, deprecated, and superseded are reserved; all identifiers must match the regex [A-Za-z][1-9][0-9]* so we have, e.g., i1 for CILI, g1 for GeoNames, and so on.
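
A quick Python sketch of that rule, using the regex and the reserved words exactly as listed above (`is_valid_ili_id`, `ILI_ID`, and `RESERVED` are hypothetical names, not anything in CILI):

```python
import re

# One letter, then a positive integer with no leading zero.
ILI_ID = re.compile(r"[A-Za-z][1-9][0-9]*")

# Names reserved for the status vocabulary.
RESERVED = {"status", "proposed", "active", "deprecated", "superseded"}


def is_valid_ili_id(name):
    """True if name matches the identifier pattern and is not reserved."""
    return name not in RESERVED and ILI_ID.fullmatch(name) is not None


# None of the reserved words can match the pattern, because the
# second character of an identifier must be a digit.
reserved_never_match = all(ILI_ID.fullmatch(word) is None for word in RESERVED)
```

Since `reserved_never_match` holds, the regex on its own already rules out a collision with these five names; the explicit reserved list is then mostly documentation.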

fcbond commented 1 week ago

Revisiting this, I think we said that a split would be shown by an ILI being deprecated and having multiple supersededBy links. I am wondering if also allowing a note field, where we could give a brief reason, would be useful. The nearest thing DC has is description; I am not sure if there is a better name we should use from somewhere else.
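
As a sketch of how a consumer might read that convention (assumptions throughout: `IliRecord`, `describe_change`, and the `note` field are hypothetical, and the IDs i900001 to i900003 are made up for illustration; only i18263 and i18262 come from the example earlier in this thread):

```python
from dataclasses import dataclass, field


@dataclass
class IliRecord:
    """Minimal stand-in for one ILI entry (hypothetical structure)."""
    ili_id: str
    deprecated: bool = False
    superseded_by: list = field(default_factory=list)
    note: str = ""  # brief free-text reason, as suggested above


def describe_change(record):
    """Classify a record by its deprecation flag and supersededBy links."""
    if not record.deprecated:
        return "none"
    if len(record.superseded_by) > 1:
        return "split"       # deprecated with several successors
    if len(record.superseded_by) == 1:
        return "superseded"  # deprecated with exactly one successor
    return "deprecated"      # deprecated with no successor


never = IliRecord("i18263", deprecated=True, superseded_by=["i18262"],
                  note="duplicate sense")
split = IliRecord("i900001", deprecated=True,
                  superseded_by=["i900002", "i900003"],
                  note="concept was too broad")
```

Here `describe_change(never)` gives "superseded" and `describe_change(split)` gives "split", so the split case needs no separate status value, only the count of supersededBy links.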