own-pt / openWordnet-PT

OpenWordnet-PT: an open access wordnet for Portuguese
http://openwordnet-pt.org

AdjectiveSynsets sameAs AdjectiveSatelliteSynsets #180

Closed fredsonaguiar closed 3 years ago

fredsonaguiar commented 3 years ago

In OWN-PT, we have no instances of AdjectiveSatelliteSynset, even for those synsets which are sameAs some AdjectiveSatelliteSynset in OWN-EN. In these cases they appear as AdjectiveSynset instead.

For instance,

<https://w3id.org/own-pt/wn30-en/instances/synset-02598110-s> owl:sameAs <https://w3id.org/own-pt/wn30-pt/instances/synset-02598110-a>

Here, synset-02598110-s is an AdjectiveSatelliteSynset, but synset-02598110-a is an AdjectiveSynset.
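
To make the mismatch concrete, here is a minimal sketch of how such pairs could be detected. It is not the project's code: triples are modelled as plain Python tuples instead of being loaded with an RDF library, `find_type_mismatches` is a hypothetical helper, and the prefixed names `rdf:type`, `owl:sameAs` and `wn30:...` are used as plain strings for brevity.

```python
# Sketch only: triples are modelled as plain (subject, predicate, object) tuples.
# A real check would load the Turtle files with an RDF library instead.
RDF_TYPE = "rdf:type"
OWL_SAME_AS = "owl:sameAs"

EN = "https://w3id.org/own-pt/wn30-en/instances/"
PT = "https://w3id.org/own-pt/wn30-pt/instances/"


def find_type_mismatches(triples):
    """Return (en, pt, en_types, pt_types) for each sameAs pair whose types differ."""
    # Collect the declared types of every subject.
    types = {}
    for s, p, o in triples:
        if p == RDF_TYPE:
            types.setdefault(s, set()).add(o)
    # Compare the type sets on both sides of each sameAs link.
    found = []
    for s, p, o in triples:
        if p == OWL_SAME_AS:
            en_types = types.get(s, set())
            pt_types = types.get(o, set())
            if en_types != pt_types:
                found.append((s, o, en_types, pt_types))
    return found


# The example pair from this issue.
triples = [
    (EN + "synset-02598110-s", RDF_TYPE, "wn30:AdjectiveSatelliteSynset"),
    (PT + "synset-02598110-a", RDF_TYPE, "wn30:AdjectiveSynset"),
    (EN + "synset-02598110-s", OWL_SAME_AS, PT + "synset-02598110-a"),
]
mismatches = find_type_mismatches(triples)
for en, pt, en_types, pt_types in mismatches:
    print(en, sorted(en_types), "!=", pt, sorted(pt_types))
```

The sketch compares full type sets, so it would also flag any other type disagreement between linked synsets, not only the satellite case.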

arademaker commented 3 years ago

We need to fix that. So far, OWN-PT is not expected to introduce ANY structural change in the OWN-EN network. Of course, the OWN-EN itself is, so far, just an RDF version of PWN 3.0.

fredsonaguiar commented 3 years ago

Since the synset relations from OWN-EN are projected onto OWN-PT, the natural solution is to change the types in those cases. In fact, those synsets act topologically as satellites. I'm doing so.

arademaker commented 3 years ago

Yes, that is what I said. The type and the URI of the synsets in OWN-PT should follow the structure of OWN-EN/PWN30.

fredsonaguiar commented 3 years ago

Commit 750d92ea7f8f4ed559977c52e9c8cdda6e81765c fixes that problem with adjective_satelites_sameas, a script that revises those types as discussed above. The command and its output:

python3 pyownpt/cli/adjective_satelites_sameas.py own-pt-synsets.ttl openWordnet-PT/own-files/own-en-synsets.ttl openWordnet-PT/own-files/own-pt-same-as.ttl -o own-pt-synsets.ttl -v
INFO:root:loading data from file 'openWordnet-PT/own-files/own-pt-synsets.ttl'
INFO:root:loading data from file 'openWordnet-PT/own-files/own-en-synsets.ttl'
INFO:root:loading data from file 'openWordnet-PT/own-files/own-pt-same-as.ttl'
INFO:ownpt:start formatting AdjectiveSatelliteSynset
INFO:ownpt:action applied to 10693 synsets
    total: 10693 triples added
    total: 10693 triples removed
INFO:root:serializing output to 'own-pt-synsets.ttl'

arademaker commented 3 years ago

The command interface for pyownpt/cli/adjective_satelites_sameas.py is weird. I would expect a single RDF file or a directory as input. It is complicated to specify, at each step, the specific fragments of the data that one needs.

arademaker commented 3 years ago

I was not able to check the changes, but I suppose for each adj synset (18156+10693 according to http://wn.mybluemix.net/search?search_field=all&term=), we should have added the right type and removed the wrong one, and changed the URI. So the numbers above are strange. I would expect more changes since the new URI would force more changes in more triples.

fredsonaguiar commented 3 years ago

In fact, in this specific case we only deal with type definitions, based on the owl:sameAs relations, so no URI needs to be replaced globally. This is why only those 3 files are needed.

The changed lines are all in own-pt-synsets, where the AdjectiveSatelliteSynset type is projected from OWN-EN onto the linked OWN-PT synsets. This is why exactly 10693 OWN-PT synsets are changed.
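
As an illustration of why the added and removed counts match, the projection can be sketched as below. This is not the code from pyownpt: `project_satellite_types` is a hypothetical helper, triples are plain tuples rather than an RDF graph, and the three arguments merely stand in for the three input files.

```python
# Sketch only: each argument is a collection of (subject, predicate, object) tuples
# standing in for own-pt-synsets, own-en-synsets and own-pt-same-as respectively.
RDF_TYPE = "rdf:type"
OWL_SAME_AS = "owl:sameAs"
ADJECTIVE = "wn30:AdjectiveSynset"
SATELLITE = "wn30:AdjectiveSatelliteSynset"


def project_satellite_types(pt_triples, en_triples, same_as):
    """Retype OWN-PT synsets linked to OWN-EN satellites; return (triples, added, removed)."""
    # OWN-EN synsets declared as satellites.
    en_satellites = {s for s, p, o in en_triples if p == RDF_TYPE and o == SATELLITE}
    # OWN-PT synsets that are sameAs one of those satellites.
    targets = {pt for en, p, pt in same_as if p == OWL_SAME_AS and en in en_satellites}

    result = set(pt_triples)
    added = removed = 0
    for pt in targets:
        old = (pt, RDF_TYPE, ADJECTIVE)
        new = (pt, RDF_TYPE, SATELLITE)
        if old in result:
            result.remove(old)  # drop the wrong type
            removed += 1
        if new not in result:
            result.add(new)  # add the type projected from OWN-EN
            added += 1
    return result, added, removed


# Toy data: one satellite pair and one plain adjective pair (identifiers shortened).
en = [("en:s1-s", RDF_TYPE, SATELLITE), ("en:s2-a", RDF_TYPE, ADJECTIVE)]
pt = [("pt:s1-a", RDF_TYPE, ADJECTIVE), ("pt:s2-a", RDF_TYPE, ADJECTIVE)]
links = [("en:s1-s", OWL_SAME_AS, "pt:s1-a"), ("en:s2-a", OWL_SAME_AS, "pt:s2-a")]

retyped, added, removed = project_satellite_types(pt, en, links)
print(f"{added} triples added, {removed} triples removed")
```

Each affected synset loses one type triple and gains one, and no URI is touched, so the two totals stay equal to the number of retyped synsets.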

The resulting changes look like:

-<https://w3id.org/own-pt/wn30-pt/instances/synset-00003553-a> a wn30:AdjectiveSynset ;
+<https://w3id.org/own-pt/wn30-pt/instances/synset-00003553-a> a wn30:AdjectiveSatelliteSynset ;
 [...]
-<https://w3id.org/own-pt/wn30-pt/instances/synset-00003700-a> a wn30:AdjectiveSynset ;
+<https://w3id.org/own-pt/wn30-pt/instances/synset-00003700-a> a wn30:AdjectiveSatelliteSynset ;
 [...] 
-<https://w3id.org/own-pt/wn30-pt/instances/synset-00003829-a> a wn30:AdjectiveSynset ;
+<https://w3id.org/own-pt/wn30-pt/instances/synset-00003829-a> a wn30:AdjectiveSatelliteSynset ;

fredsonaguiar commented 3 years ago

we should have added the right type and removed the wrong one, and changed the URI

For modularity, we leave the problem of redefining the URIs to #176.