jmccrae / gwn-scala-api

API for working with GWN formats
Apache License 2.0

WordNet Converter synset relations #22

Closed: rwingerter55 closed this issue 3 years ago

rwingerter55 commented 3 years ago

I used the WordNet converter (http://server1.nlp.insight-centre.org/gwn-converter/#) to convert OdeNet v.1.4 (https://github.com/hdaSprachtechnologie/odenet) to RDF/XML. After I imported the RDF data into VocBench3 (http://vocbench.uniroma2.it/doc/), the VocBench3 user interface did not display any synset relations.

As a test, I also converted Open English WordNet (2021 Release Candidate) from GWA XML to RDF/XML, and experienced the same problem after importing the data into VocBench. Here is an example.

When I import the official RDF export (http://john.mccr.ae/oewn2021/english-wordnet-2021.ttl.gz) into VocBench, I get:

oewnid:ewn-07539144-n a ontolex:LexicalConcept;
  dct:subject "noun.feeling";
  skos:inScheme <https://en-word.net/>;
  wn:definition _:node1fes0utk9x240856;
  wn:hypernym oewnid:ewn-07495208-n;
  wn:hyponym oewnid:ewn-07523471-n, oewnid:ewn-07539481-n, oewnid:ewn-07539768-n, oewnid:ewn-07539999-n,
    oewnid:ewn-07540157-n, oewnid:ewn-07540296-n, oewnid:ewn-07540606-n, oewnid:ewn-07540794-n,
    oewnid:ewn-07540999-n, oewnid:ewn-07541241-n;
  wn:ili ili:i76293;
  wn:partOfSpeech wn:noun .

Whereas when I convert the GWA XML (http://john.mccr.ae/oewn2021/english-wordnet-2021.xml) to RDF and import it into VocBench, I don't get any synset relations:

:ewn-07539144-n a ontolex:LexicalConcept;
  dc:subject "noun.feeling";
  skos:inScheme :ewn;
  wn:definition _:node1fgot68qcx631819;
  wn:ili ili:i76293;
  wn:partOfSpeech wn:noun .

rwingerter55 commented 3 years ago

Here is a snippet from the RDF created by the converter:

  <rdf:Description rdf:nodeID="A631732">
    <vartrans:source rdf:resource="https://en-word.net/ewn-07539144-n"/>
    <vartrans:target rdf:resource="https://en-word.net/ewn-07495208-n"/>
    <vartrans:category rdf:resource="https://globalwordnet.github.io/schemas/wn#hypernym"/>
    <rdf:type rdf:resource="http://www.w3.org/ns/lemon/vartrans#SynsetRelation"/>
  </rdf:Description>

Maybe vartrans:SynsetRelation should be vartrans:SenseRelation?

jmccrae commented 3 years ago

It should be vartrans:ConceptualRelation; the diagram in the specification has an error.

jmccrae commented 3 years ago

The schema specifies that we should use the reified form of the relations when exporting to RDF:

https://globalwordnet.github.io/schemas/

However, the official export I use for OEWN uses direct relations, which are easier to work with but can't capture all of the detail available in the XML version (e.g., sources for the relations).

Probably the best solution is to also support an abbreviated form of export?
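
To illustrate the difference between the two forms, here is a minimal Python sketch that flattens reified relation nodes into direct triples. It is not the converter's implementation: the function name `direct_triples` and the inline `SAMPLE` document are hypothetical, and the sketch assumes the relations are serialized as `rdf:Description` elements with `vartrans:source`, `vartrans:category` and `vartrans:target` children, as in the snippet quoted earlier in this thread.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
VARTRANS = "http://www.w3.org/ns/lemon/vartrans#"

# Hypothetical input: the reified relation quoted earlier in this thread,
# wrapped in an rdf:RDF root so that it parses on its own.
SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:vartrans="http://www.w3.org/ns/lemon/vartrans#">
  <rdf:Description rdf:nodeID="A631732">
    <vartrans:source rdf:resource="https://en-word.net/ewn-07539144-n"/>
    <vartrans:target rdf:resource="https://en-word.net/ewn-07495208-n"/>
    <vartrans:category rdf:resource="https://globalwordnet.github.io/schemas/wn#hypernym"/>
    <rdf:type rdf:resource="http://www.w3.org/ns/lemon/vartrans#SynsetRelation"/>
  </rdf:Description>
</rdf:RDF>"""


def direct_triples(rdf_xml):
    """Yield (source, category, target) for every reified relation node."""
    root = ET.fromstring(rdf_xml)
    for node in root.findall(f"{{{RDF}}}Description"):

        def resource(prop):
            # Return the rdf:resource IRI of the vartrans:<prop> child, if any.
            child = node.find(f"{{{VARTRANS}}}{prop}")
            return None if child is None else child.get(f"{{{RDF}}}resource")

        source = resource("source")
        category = resource("category")
        target = resource("target")
        # Nodes lacking any of the three parts are not relations; skip them.
        if source and category and target:
            yield (source, category, target)


triples = list(direct_triples(SAMPLE))
for s, p, o in triples:
    # Print each direct relation as an N-Triples style line.
    print(f"<{s}> <{p}> <{o}> .")
```

The sketch ignores `rdf:type`, so it behaves the same whether the node is typed `SynsetRelation` or `ConceptualRelation`. It also drops everything else attached to the relation node, which is exactly the loss of detail the direct form implies.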

rwingerter55 commented 3 years ago

I think so. AFAICS, VocBench does not support the reified form of the relations. FWIW, below is a snippet from pwn-concepts.rdf, which is part of an OMW distribution (cf. http://vocbench.uniroma2.it/doc/user/test_drive.jsf#creating_a_ontolex_project_for_managing_a_large_lexicon_by_connecting_to_an_external_triple_store). Note that the RDF file has both wn:hypernym/wn:hyponym and skos:broader/narrower relations.

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
    xmlns:ontolex="http://www.w3.org/ns/lemon/ontolex#"
    xmlns:skos="http://www.w3.org/2004/02/skos/core#"
    xmlns:wn="http://globalwordnet.github.io/schemas/wn#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:vartrans="http://www.w3.org/ns/lemon/vartrans#">

<rdf:Description rdf:about="http://art.uniroma2.it/pmki/omw/pwn30-conceptset">
    <rdf:type rdf:resource="http://www.w3.org/ns/lemon/ontolex#ConceptSet"/>
    <title xmlns="http://purl.org/dc/terms/" xml:lang="en">Princeton WordNet 3.0 Concept Set</title>
</rdf:Description>

<rdf:Description rdf:about="http://art.uniroma2.it/pmki/omw/00001740-n">
    <rdf:type rdf:resource="http://www.w3.org/ns/lemon/ontolex#LexicalConcept"/>
    <wn:partOfSpeech rdf:resource="http://globalwordnet.github.io/schemas/wn#noun"/>
    <skos:definition rdf:nodeID="node1cjlldffux1"/>
</rdf:Description>

<rdf:Description rdf:nodeID="node1cjlldffux1">
    <rdf:value xml:lang="en">that which is perceived or known or inferred to have its own distinct existence (living or nonliving)</rdf:value>
</rdf:Description>

<rdf:Description rdf:about="http://art.uniroma2.it/pmki/omw/00001740-n">
    <sameAs xmlns="http://www.w3.org/2002/07/owl#" rdf:resource="http://ili.globalwordnet.org/ili/i35545"/>
</rdf:Description>

<rdf:Description rdf:nodeID="node1cjlldffux2">
    <rdf:type rdf:resource="http://www.w3.org/ns/lemon/vartrans#ConceptualRelation"/>
    <vartrans:category rdf:resource="http://globalwordnet.github.io/schemas/wn#hyponym"/>
    <vartrans:source rdf:resource="http://art.uniroma2.it/pmki/omw/00001740-n"/>
    <vartrans:target rdf:resource="http://art.uniroma2.it/pmki/omw/00001930-n"/>
</rdf:Description>

<rdf:Description rdf:about="http://art.uniroma2.it/pmki/omw/00001740-n">
    <skos:narrower rdf:resource="http://art.uniroma2.it/pmki/omw/00001930-n"/>
</rdf:Description>

<rdf:Description rdf:nodeID="node1cjlldffux3">
    <rdf:type rdf:resource="http://www.w3.org/ns/lemon/vartrans#ConceptualRelation"/>
    <vartrans:category rdf:resource="http://globalwordnet.github.io/schemas/wn#hyponym"/>
    <vartrans:source rdf:resource="http://art.uniroma2.it/pmki/omw/00001740-n"/>
    <vartrans:target rdf:resource="http://art.uniroma2.it/pmki/omw/00002137-n"/>
</rdf:Description>

<rdf:Description rdf:about="http://art.uniroma2.it/pmki/omw/00001740-n">
    <skos:narrower rdf:resource="http://art.uniroma2.it/pmki/omw/00002137-n"/>
</rdf:Description>
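
The pairing in the snippet above (each wn:hyponym relation accompanied by a skos:narrower triple) could be produced along these lines. This is a minimal Python sketch under stated assumptions: `add_skos` and `SKOS_FOR` are hypothetical names, the hypernym to skos:broader mapping is inferred as the inverse of what the snippet shows, and the category is matched on its local name because the wn namespace appears with both http and https in this thread.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Assumed mapping: the snippet pairs wn:hyponym with skos:narrower;
# hypernym -> skos:broader is the inverse and is an assumption here.
SKOS_FOR = {
    "hyponym": SKOS + "narrower",
    "hypernym": SKOS + "broader",
}


def add_skos(triples):
    """Return the input triples plus a SKOS triple for each mapped relation."""
    out = []
    for s, p, o in triples:
        out.append((s, p, o))
        # Compare only the part after '#', so the http and https variants
        # of the wn namespace are treated alike.
        local = p.rsplit("#", 1)[-1]
        if local in SKOS_FOR:
            out.append((s, SKOS_FOR[local], o))
    return out


# One relation taken from the snippet above, written as a direct triple.
relations = [
    (
        "http://art.uniroma2.it/pmki/omw/00001740-n",
        "http://globalwordnet.github.io/schemas/wn#hyponym",
        "http://art.uniroma2.it/pmki/omw/00001930-n",
    ),
]
expanded = add_skos(relations)
for triple in expanded:
    print(triple)
```

Relations without an entry in `SKOS_FOR` pass through unchanged, so other wn relation types are kept but get no SKOS counterpart.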

jmccrae commented 3 years ago

I pushed changes to implement this.

At the moment the deployed version of this is down due to an ongoing cyber-attack: https://www.independent.ie/irish-news/education/nui-galway-subject-of-attempted-cyber-attack-40904051.html

rwingerter55 commented 3 years ago

Thank you and good luck.