Closed simongray closed 2 years ago
We chose this because there were some values, e.g., adjective_satellite, that aren't in LexInfo and wouldn't really make sense to add. We also wanted to avoid mixing namespaces (e.g., lexinfo:noun with wn:adjective_satellite).
However, adding some owl:sameAs links to LexInfo would be very useful!
Ok, then I will make an attempt at that for 1.2.
Having taken a look at it, there doesn't seem to be a way to make it compatible in a satisfactory way. Defining owl:sameAs doesn't map the actual relations, just the instances.
Lexinfo unfortunately has a definite range (the lexinfo:PartOfSpeech class) so I'm unsure whether it's even possible to define wn:partOfSpeech as a sub-property of lexinfo:partOfSpeech while extending its range to encompass both PartOfSpeech classes. It doesn't seem like it will be possible.
Not sure I get what the issue is here. I would assume that wn:PartOfSpeech ⊑ lexinfo:PartOfSpeech for both the class and the property?
I am not an OWL expert, that is probably the main issue ;-)
What you're saying is true. I guess owl:unionOf could be used to define the 1.2 wn:partOfSpeech range to be both wn:PartOfSpeech and lexinfo:PartOfSpeech? Still, not using lexinfo:partOfSpeech directly does make it harder to integrate directly with other Ontolex datasets.
If you add a subclass axiom between wn:PartOfSpeech and lexinfo:PartOfSpeech then there is no need for a unionOf statement: every wn:PartOfSpeech is also a lexinfo:PartOfSpeech, so wn:PartOfSpeech ⊔ lexinfo:PartOfSpeech ≡ lexinfo:PartOfSpeech.
That is if we add
wn:partOfSpeech rdfs:subPropertyOf lexinfo:partOfSpeech .
wn:PartOfSpeech rdfs:subClassOf lexinfo:PartOfSpeech .
wn:noun owl:sameAs lexinfo:noun . # etc.
Then if we have
X wn:partOfSpeech wn:noun
We then infer
X lexinfo:partOfSpeech lexinfo:noun
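The inference chain above can be sketched in plain Python as a naive forward-chaining loop over two rules. This is only an illustration under stated assumptions: triples are modelled as tuples of prefixed-name strings, the function name `infer` is hypothetical, the standard `rdfs:subPropertyOf` term is used for the sub-property axiom, and `owl:sameAs` is applied only to the object position in one direction (a real OWL reasoner also treats it as symmetric and substitutes subjects).

```python
SUBPROPERTY_OF = "rdfs:subPropertyOf"
SAME_AS = "owl:sameAs"


def infer(triples):
    """Naive forward chaining to a fixed point over two rules (illustrative only)."""
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            # Rule 1: (s p o), (p rdfs:subPropertyOf q) => (s q o)
            for p2, rel, q in facts:
                if rel == SUBPROPERTY_OF and p2 == p:
                    new.add((s, q, o))
            # Rule 2: (s p o), (o owl:sameAs o2) => (s p o2)
            for o1, rel, o2 in facts:
                if rel == SAME_AS and o1 == o:
                    new.add((s, p, o2))
        if new <= facts:  # nothing new was derived, so we are done
            return facts
        facts |= new


# Schema axioms from the comment above, plus the single data triple.
schema = [
    ("wn:partOfSpeech", SUBPROPERTY_OF, "lexinfo:partOfSpeech"),
    ("wn:PartOfSpeech", "rdfs:subClassOf", "lexinfo:PartOfSpeech"),
    ("wn:noun", SAME_AS, "lexinfo:noun"),
]
data = [("X", "wn:partOfSpeech", "wn:noun")]

closure = infer(schema + data)
print(("X", "lexinfo:partOfSpeech", "lexinfo:noun") in closure)
```

A dataset that only asserts wn:partOfSpeech triples therefore still yields the lexinfo:partOfSpeech triples that other Ontolex tooling looks for, provided the reasoner in use applies both rules.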
Ah, that makes a lot of sense. I was stuck thinking that of course the subProperty can't extend the range of its parent property, but yeah... if we define everything as subclasses it will technically not be doing that. Thanks a lot for explaining in this detail.
I do wonder what to do about the owl:oneOf relation from the wn:PartOfSpeech currently defined in the schema. I believe that the set of POS tags in lexinfo is more extensive - or at least not finite - so I wonder if it makes sense to also remove this restriction if, say, you wanted to use a lexinfo:PartOfSpeech as the object of a wn:partOfSpeech relation...? What are your thoughts on this?
...
As for the actual inference, I think I will also have to take another look at the Prolog-like logic rule DSL used in Apache Jena to make sure it applies the same kind of reasoning you just described in practice. The default level of OWL inference was a bit too comprehensive (and therefore slow) on the DanNet data, so to make it snappier I basically removed all statements that didn't infer inverse relationships ;-)
I do wonder what to do about the owl:oneOf relation from the wn:PartOfSpeech currently defined in the schema. I believe that the set of POS tags in lexinfo is more extensive - or at least not finite - so I wonder if it makes sense to also remove this restriction if, say, you wanted to use a lexinfo:PartOfSpeech as the object of a wn:partOfSpeech relation...? What are your thoughts on this?
It would not be compatible with this schema to use a part-of-speech value other than the ones specified. We need this to ensure interoperability. (Of course, we are open to proposals for new values.)
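In practice, the closed enumeration means a dataset can be checked for out-of-schema values before publishing. A minimal sketch, assuming triples are tuples of prefixed-name strings; the set `ALLOWED_POS` is a deliberately incomplete, illustrative subset (only values named in this thread), and `invalid_pos_values` is a hypothetical helper, not part of any schema tooling.

```python
# Illustrative subset only: the real schema enumerates more values than these.
ALLOWED_POS = frozenset({"wn:noun", "wn:adjective_satellite"})


def invalid_pos_values(triples, allowed=ALLOWED_POS):
    """Return the sorted wn:partOfSpeech objects that fall outside the enumeration."""
    return sorted(
        {o for _, p, o in triples if p == "wn:partOfSpeech" and o not in allowed}
    )


dataset = [
    ("entry1", "wn:partOfSpeech", "wn:noun"),
    ("entry2", "wn:partOfSpeech", "lexinfo:noun"),   # not one of the enumerated values
    ("entry3", "lexinfo:partOfSpeech", "lexinfo:noun"),  # different property, ignored
]
print(invalid_pos_values(dataset))
```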
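I'm sorry @jmccrae, but I have a couple of remaining questions about things that are still not quite clear to me...
Can you tell me what the distinction is between e.g. the capitalised and all-lowercase versions of POS tags in lexinfo, e.g. Adjective vs adjective? It is not immediately clear to me.
Since you're involved with writing both this schema and the lexinfo one, how come you don't just add e.g. adjective_satellite to lexinfo and use lexinfo directly? Is it because adjective_satellite is non-standard...?
Can you tell me what the distinction is between e.g. the capitalised and all-lowercase versions of POS tags in lexinfo, e.g. Adjective vs adjective? It is not immediately clear to me.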
Essentially, capitalised names are for classes and lower case for values. So Adjective is a subclass of LexicalEntry, while adjective is the value of the part-of-speech property. The following equivalence basically holds:
X rdf:type ontolex:LexicalEntry and X lexinfo:partOfSpeech lexinfo:adjective <=> X rdf:type lexinfo:Adjective
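Read as a pair of rules, the equivalence can be applied in both directions. The sketch below is only an illustration: triples are tuples of prefixed-name strings, `VALUE_TO_CLASS` is a hypothetical lookup containing just the one pair mentioned here, and `apply_equivalence` is not a function from any library.

```python
RDF_TYPE = "rdf:type"
LEXICAL_ENTRY = "ontolex:LexicalEntry"
POS = "lexinfo:partOfSpeech"

# Hypothetical value-to-class lookup; only the pair discussed above is listed.
VALUE_TO_CLASS = {"lexinfo:adjective": "lexinfo:Adjective"}


def apply_equivalence(triples):
    """Add the triples implied by the equivalence, in both directions."""
    facts = set(triples)
    derived = set()
    for value, cls in VALUE_TO_CLASS.items():
        for s, p, o in facts:
            # Left to right: a LexicalEntry with this POS value is typed with the class.
            if p == POS and o == value and (s, RDF_TYPE, LEXICAL_ENTRY) in facts:
                derived.add((s, RDF_TYPE, cls))
            # Right to left: the class implies both the LexicalEntry type and the value.
            if p == RDF_TYPE and o == cls:
                derived.add((s, RDF_TYPE, LEXICAL_ENTRY))
                derived.add((s, POS, value))
    return facts | derived


by_value = apply_equivalence([
    ("X", RDF_TYPE, LEXICAL_ENTRY),
    ("X", POS, "lexinfo:adjective"),
])
by_class = apply_equivalence([("Y", RDF_TYPE, "lexinfo:Adjective")])
print(("X", RDF_TYPE, "lexinfo:Adjective") in by_value)
print(("Y", POS, "lexinfo:adjective") in by_class)
```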
Since you're involved with writing both this schema and the lexinfo one, how come you don't just add e.g. adjective_satellite to lexinfo and use lexinfo directly? Is it because adjective_satellite is non-standard...?
Exactly
Scanning through the Turtle file, I noticed that you define your own POS relations and classes rather than use the lexinfo:partOfSpeech relation which is heavily used in the Ontolex specification, which I understand that @jmccrae helped bring to life. I'm unsure why this is the case?
In the Ontolex specification it is specifically stated that
I think that this is an excellent ideal as it makes integration of existing datasets mostly a matter of merging sets of triples. The second best option would be having some kind of derived lexinfo relation triple which can be inferred via equivalent/subclass relations.
Unfortunately, the GWA schema's partOfSpeech relation and PartOfSpeech class are proprietary and not linked to any external definitions. I have used Ontolex as the basis for the new version of DanNet, so my part-of-speech tags are all defined using lexinfo:partOfSpeech relation rather than wn:partOfSpeech.
How do you suggest we bridge this gap? The way I see it, either version 1.2 of the schema removes this bit and datasets use lexinfo:partOfSpeech directly -OR- a direct equivalency to lexinfo:partOfSpeech is established in the schema -- preferably the first as it simplifies things.
I could also add both wn and lexinfo relations for all LexicalEntry classes in the new DanNet, but that's both confusing and a messy fix IMO. Better to fix the schema than work around its flaws. Having competing standards for this is not a great situation.
The relevant part of the schema: