own-pt / own-en-legacy

The openWordnet-EN, a converted and expanded PWN
MIT License

Rethink unique sense ids for satellite adjectives #13

Closed fcbr closed 7 years ago

fcbr commented 7 years ago

In adjs.all.txt:

$ grep "w: zippy" adjs.all.txt 
w: zippy drf noun.attribute:zip
w: zippy

fcbr commented 7 years ago

Both entries have lex_id 0, which explains the lack of suffixes. PWN "disambiguates" them only through the head word in the sense key, like this:

zippy%5:00:00:energetic:00 00874226 1 0
zippy%5:00:00:lively:00 00805309 2 0

So we may need to follow suit somehow.
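For reference, a minimal sketch of how the two lines above break down, assuming the standard PWN sense index layout (sense_key synset_offset sense_number tag_cnt, with sense_key = lemma%ss_type:lex_filenum:lex_id:head_word:head_id). The names SenseEntry and parse_sense_line are made up for illustration and are not part of this repository.

```python
from typing import NamedTuple


class SenseEntry(NamedTuple):
    lemma: str
    ss_type: int        # 5 marks an adjective satellite
    lex_filenum: int
    lex_id: int
    head_word: str      # empty for anything that is not a satellite
    head_id: str        # kept as text because it may be empty
    synset_offset: str
    sense_number: int
    tag_cnt: int


def parse_sense_line(line: str) -> SenseEntry:
    """Split one sense index line into its fields."""
    sense_key, offset, sense_number, tag_cnt = line.split()
    lemma, rest = sense_key.split("%", 1)
    ss_type, lex_filenum, lex_id, head_word, head_id = rest.split(":")
    return SenseEntry(
        lemma, int(ss_type), int(lex_filenum), int(lex_id),
        head_word, head_id, offset, int(sense_number), int(tag_cnt),
    )


a = parse_sense_line("zippy%5:00:00:energetic:00 00874226 1 0")
b = parse_sense_line("zippy%5:00:00:lively:00 00805309 2 0")

# Same lemma, lexicographer file and lex_id; only the head word differs.
same_prefix = (a.lemma, a.lex_filenum, a.lex_id) == (b.lemma, b.lex_filenum, b.lex_id)
print(same_prefix, a.head_word, b.head_word)
```

So an id built only from lemma, file and lex_id collides for these two senses, while the full sense key does not.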

fcbr commented 7 years ago

OK, the problem is restricted to satellite adjectives and there are A LOT of them. We need to rethink our strategy for unique sense ids for them.

fcbr commented 7 years ago

I have a solution for this, but the simplest implementation requires regenerating ALL lex_ids instead of "fixing" only the duplicated ones. The side effect is that we need to fix the equivalence mapping to sense keys. Working on that next.
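A rough sketch of the idea, not the code in the actual fix: walk every sense in file order, hand out a fresh lex_id per (file, lemma) pair, and record which PWN sense key each new id corresponds to. The id format "file:lemma:lex_id" and the function name assign_lex_ids are assumptions made up for this example.

```python
from collections import defaultdict


def assign_lex_ids(senses):
    """senses: iterable of (lexfile, lemma, pwn_sense_key) in file order.

    Returns a dict mapping each newly generated id to its PWN sense key.
    """
    counters = defaultdict(int)   # next free lex_id per (lexfile, lemma)
    mapping = {}
    for lexfile, lemma, pwn_key in senses:
        lex_id = counters[(lexfile, lemma)]
        counters[(lexfile, lemma)] += 1
        # Hypothetical id layout, chosen only for illustration.
        mapping[f"{lexfile}:{lemma}:{lex_id}"] = pwn_key
    return mapping


mapping = assign_lex_ids([
    ("adjs.all", "zippy", "zippy%5:00:00:energetic:00"),
    ("adjs.all", "zippy", "zippy%5:00:00:lively:00"),
])
for new_id, pwn_key in mapping.items():
    print(new_id, "->", pwn_key)
```

Because every sense gets a new number, ids that were never duplicated can change as well, which is why the mapping back to PWN sense keys has to be rebuilt for all entries.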

fcbr commented 7 years ago

https://github.com/own-pt/wordnet-dsl/commit/808a1a3f65e13962b9cbac08ff6d925af25f917c fixes this but needs more testing before updating the files.

arademaker commented 7 years ago

Why did we get changes in entries without any link to adjs?

fcbr commented 7 years ago

Like I said, the simpler solution was to regenerate all sense ids. As long as we are breaking compatibility with PWN, this is not a problem, since we have a mapping file anyway.
