globalwordnet / OMW

The Open Multilingual Wordnet
http://compling.hss.ntu.edu.sg/omw/
MIT License

Some information about senses is not stored #63

Open fcbond opened 5 years ago

fcbond commented 5 years ago

I just checked and we are missing a few things from the database:

    324 also_sees
     15 region_domains
     10 topic_domains
    398 usage_domains
   1055 syntactic_marker
      2 verb_groups

NLTK does not read participles, so we are also missing them!

The *_domain relations link to lemmas, but I think they should link to synsets.

Syntactic markers are for adjectives:

(p) predicate position
(a) prenominal (attributive) position
(ip) immediately postnominal position

I think we should make three new entries (attributive adjective, predicative adjective and postpositive adjective: check Huddleston for names).
Then we could link with exemplifies. We should cross-check with the ERG.
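A minimal sketch of how the marker annotations could be turned into exemplifies links. It assumes the marker is appended to the word in parentheses, as in `galore(ip)`; the label strings, the `exemplifies_link` helper and the sense identifier are hypothetical names used only for illustration, not part of PWN or OMW.

```python
import re

# Marker codes from the list above; the label strings are illustrative only.
MARKER_LABELS = {
    "p": "predicative adjective",
    "a": "attributive adjective",
    "ip": "postpositive adjective",
}

# Assumption: a marked word looks like "galore(ip)".
_MARKER_RE = re.compile(r"^(?P<lemma>.+?)\((?P<code>p|a|ip)\)$")


def split_marker(word):
    """Split 'galore(ip)' into ('galore', 'ip'); unmarked words give (word, None)."""
    match = _MARKER_RE.match(word)
    if match is None:
        return word, None
    return match.group("lemma"), match.group("code")


def exemplifies_link(word, sense_id):
    """Return a (sense, 'exemplifies', label) triple, or None if the word is unmarked."""
    _, code = split_marker(word)
    if code is None:
        return None
    return (sense_id, "exemplifies", MARKER_LABELS[code])
```

With this, no separate marker field is needed: each marked sense simply gets one extra link.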

We have no examples of participles (but e.g. elapsed/elapse should be), similar or other.

goodmami commented 5 years ago

These syntactic markers, or at least the descriptions, seem rather English-specific. For instance, it is not necessarily the case that attributive adjectives are prenominal (such as in Spanish or French), and it would be odd to have postpositive without prepositive. And even for English, what is the intended use? To mark attributives that cannot vary their position (such as main or ago)?

fcbond commented 5 years ago

G'day,

Yes, these are language-specific (which is why they should be linked to senses). We don't have prepositive as it is the default in English (so these are marked subtypes). And, as you say, they are used to mark things such as main or ago.

-- Francis Bond http://www3.ntu.edu.sg/home/fcbond/ Division of Linguistics and Multilingual Studies Nanyang Technological University

goodmami commented 5 years ago

We don't have prepositive as it is the default in English (so these are marked subtypes)

That is my point. In some other languages postpositive would be the default, so it would be improper, I think, to define a postpositive marker without the option for a prepositive marker, even if you only mark the non-default case on senses for each language.

Or perhaps I'm not understanding what syntactic markers are or what you mean by "make three new entries"...

fcbond commented 5 years ago

These are already in the Princeton Wordnet database, but as a special case. So I want to make sure that we lose no data.

We can make the definitions specific to English (and add a domain link!). I do NOT want a special marker, just links to synsets.

arademaker commented 5 years ago

Regarding relations, I noted some time ago that some relations are between synsets and senses, but Francis surely knows that! Just helping to ensure no data will be lost.

fcbond commented 5 years ago

Can you give some concrete examples? I could not find any in PWN (but it could be that we lost them somewhere).

arademaker commented 5 years ago

Sorry, my mistake. In our paper http://arademaker.github.io/files/gwc-2016-icv.pdf we discussed that the range and domain of some relations in the WN docs are not precisely defined (section 5.2).

The following pointer types are usually used to indicate lexical relations: Antonym, Pertainym, Participle, Also See, Derivationally Related. The remaining pointer types are generally used to represent semantic relations.

But double-checking now, I confirmed that we have neither sense-synset nor synset-sense relations for them, but we do have both sense-sense and synset-synset.

https://ibm.co/2UeyZhl

fcbond commented 5 years ago

Thanks for checking.

I think we should allow sense-synset, but the PWN database did not allow for it. Instead they link a sense to all of the senses of some synsets!

I am hoping we can introduce it in OMW, ...
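As a rough sketch of what recovering such links might look like: if a sense is linked to every sense of some synset, the fanned-out sense-sense links can be collapsed into a single sense-synset link. The data layout (a set of sense pairs and a synset-to-senses mapping) and the function name are assumptions for illustration, not the PWN or OMW schema.

```python
def collapse_to_sense_synset(sense_links, members):
    """Collapse fanned-out sense-sense links into sense-synset links.

    sense_links: set of (source_sense, target_sense) pairs.
    members: dict mapping a synset id to the set of its sense ids.
    Returns the set of (source_sense, synset) pairs for which the source
    sense is linked to every sense of that synset.
    """
    targets_by_source = {}
    for source, target in sense_links:
        targets_by_source.setdefault(source, set()).add(target)

    collapsed = set()
    for source, targets in targets_by_source.items():
        for synset, senses in members.items():
            # Only collapse when all senses of the synset are covered.
            if senses and senses <= targets:
                collapsed.add((source, synset))
    return collapsed
```

A partial match (a sense linked to only some senses of a synset) is left alone, since it may be a genuine sense-sense relation.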

arademaker commented 5 years ago

Ah, that is another way to see it. You are saying that the proper translation to RDF would be to propagate some synset1-synset2 relations from all senses of synset1 to all senses of synset2, is that right?

But as you can see in this other query https://ibm.co/2FY6VGp for the classifiedBy* and frame relations that we mentioned in the article, the sense-sense relations are exceptions:

?REL, ?domain, ?range, ?count
classifiedByRegion,WordSense,WordSense,15
frame,WordSense,String,365
classifiedByTopic,WordSense,WordSense,11
classifiedByUsage,WordSense,WordSense,409

Or are you saying that it is the other way around? That is, some sense-sense relations should be collapsed to synset1-synset2 whenever we confirm that all senses of synset1 are linked to all senses of synset2 by that relation.
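The second reading could be checked mechanically. A minimal sketch, assuming the sense-sense links for one relation are available as a set of pairs and synset membership as a mapping (both hypothetical data layouts, not an existing API):

```python
from itertools import product


def liftable_synset_pairs(sense_links, members):
    """Find synset pairs whose senses are fully cross-linked.

    sense_links: set of (source_sense, target_sense) pairs for one relation.
    members: dict mapping a synset id to the set of its sense ids.
    Returns the set of (synset1, synset2) pairs such that every sense of
    synset1 is linked to every sense of synset2.
    """
    sense_to_synset = {
        sense: synset for synset, senses in members.items() for sense in senses
    }
    # Only synset pairs with at least one sense-sense link can qualify.
    candidates = {
        (sense_to_synset[a], sense_to_synset[b]) for a, b in sense_links
    }
    return {
        (src, tgt)
        for src, tgt in candidates
        if all(pair in sense_links for pair in product(members[src], members[tgt]))
    }
```

Pairs that fail the check stay as sense-sense links, which would match the exceptions in the query output above.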

goodmami commented 5 years ago

We can make the definitions specific to English (and add a domain link!). I do NOT want a special marker, just links to synsets.

Ah ok, so map the PWN syntactic marker annotations to OMW sense-synset links.

Still, I can imagine if one sees (postpositive, is_exemplified_by, "ago") they might want to find what is exemplified by "prepositive" as well. It would help if we could negate subqueries, e.g., those items that are not exemplified by "postpositive" (which would include unconstrained and strictly prepositive examples).
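To make the negated subquery concrete, here is a small sketch that treats it as a set difference over triples. The triple layout and the names used are assumptions for illustration only.

```python
def not_exemplified_by(all_senses, links, category):
    """Return the senses that have no 'exemplifies' link to the given category.

    all_senses: iterable of sense ids.
    links: iterable of (sense, relation, target) triples.
    """
    marked = {
        sense
        for sense, relation, target in links
        if relation == "exemplifies" and target == category
    }
    return set(all_senses) - marked
```

The result mixes unconstrained senses with strictly prepositive ones, which is the ambiguity I mean: without an explicit prepositive entry the two cannot be told apart.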

fcbond commented 5 years ago

I see your point, but I think almost all English adjectives are prepositive, right?

I guess French or Indonesian might have more interesting examples.

arademaker commented 5 years ago

In Portuguese, a prepositive adjective usually has a figurative meaning and a postpositive one usually has the literal meaning.