Closed goodmami closed 2 years ago
OK, I rebuilt the mapping (ili-map-pwn31.tab) as follows:
grep sameAs ../cili/ili-map-wn31.ttl | cut -d ' ' -f1,3 --output-delimiter=$' ' | sed s/pwn31:[12345]// | sed s/ili://| sort -nk 1.2 > ili-map-pwn31.tab
This keeps only the sameAs links.
Somehow the file had null (\0) delimiters instead of tab delimiters between the fields. The following command produces tabs:
$ sed -rn '/owl:sameAs/{s/ili:([^ ]*) owl:sameAs pwn31:[1-5]([^ ]*) .*/\1\t\2/;p}' ili-map-wn31.ttl | sort -nk 1.2 > ili-map-pwn31.tab
The sort command is unnecessary; the same results are obtained without it, as the Turtle file is already sorted in this order. But it's also good to be explicit.
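One way to confirm that the output is already in order is to ask sort to check rather than sort. This is only a sketch: is_sorted_by_ili is a hypothetical helper name, not something in the cili repository, and it assumes the same key (-nk 1.2) as the command above.

```shell
# is_sorted_by_ili FILE: succeed if FILE is already in `sort -nk 1.2` order.
# (Hypothetical helper; the name is made up for illustration.)
is_sorted_by_ili() {
    # -c only checks the order and produces no sorted output;
    # the "disorder" diagnostic on stderr is silenced here.
    sort -c -nk 1.2 "$1" 2>/dev/null
}
```

Running is_sorted_by_ili on the generated .tab file should return success if the explicit sort step really is a no-op.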
I'll check in the new file.
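Before checking in a regenerated file, a quick sanity check for stray null bytes and malformed rows might look like the following. It is a rough sketch, not the check actually used here: check_tab_file is a hypothetical name, and it assumes every row should have exactly two tab-separated fields.

```shell
# check_tab_file FILE: report NUL bytes and rows that are not two tab-separated fields.
# (Hypothetical helper; assumes a two-column ILI-to-synset mapping.)
check_tab_file() {
    # tr -cd deletes every byte except NUL, so wc -c counts the NUL bytes.
    nul_count=$(tr -cd '\000' < "$1" | wc -c)
    # awk prints each row whose field count is not 2; wc -l counts those rows.
    bad_rows=$(awk -F'\t' 'NF != 2' "$1" | wc -l)
    # $(( )) strips the padding some wc implementations add.
    echo "nul=$((nul_count)) bad_rows=$((bad_rows))"
}
```

A clean file should report zero for both counts; a file with null delimiters shows a non-zero nul count and as many bad rows as there are affected lines.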
I checked with ediff-buffers, don't know how that crept in.
Thanks for the fix.
On Thu, Nov 4, 2021 at 7:45 AM Michael Wayne Goodman wrote:
> Somehow the file had null (\0) delimiters instead of tab delimiters between the fields. The following command produces tabs:
> $ sed -rn '/owl:sameAs/{s/ili:([^ ]*) owl:sameAs pwn31:[1-5]([^ ]*) .*/\1\t\2/;p}' ili-map-wn31.ttl | sort -nk 1.2 > ili-map-pwn31.tab
> The sort command is unnecessary; the same results are obtained without it as the Turtle file is already sorted in this order. But it's also good to be explicit.
> I'll check in the new file.
-- Francis Bond http://www3.ntu.edu.sg/home/fcbond/ Division of Linguistics and Multilingual Studies Nanyang Technological University
Yeah, not sure. I tried out your commands and it seemed to work. Strange.
There are some ILI links with skos:closeMatch in the Turtle file for the PWN31 mapping, but these are showing up in the corresponding tab file simply as ILI links. These should be removed or fixed somehow.
It looks like the skos:closeMatch ones usually have the ILI "in" in the OEWN:
I tried to determine if this is always the case:
So it looks like 3 of 4 closeMatch ILI links are parallel to a sameAs link to the same synset. The other one only had closeMatch links, from two ILIs, and no sameAs link to the same synset. That synset has ili="in" (it wasn't excluded before because there were two ILIs, i50032 and i50034, which pointed to the same synset, and comm did not exclude the second instance). Digging into that one further:
We don't have a precedent or a good way to add introduced ILIs to the mapping. Assigning ili="in" is done for a wordnet project, and here we should generate new IDs. So I suppose all of these closeMatch cases should simply be dropped from the .tab files?
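To see which synsets would be affected by dropping them, something like the following could list the synsets that have a skos:closeMatch link but no owl:sameAs link. This is only a sketch: targets and close_only are hypothetical helper names, and it assumes the closeMatch lines in the Turtle file have the same shape as the sameAs lines matched by the sed command earlier in this thread (ili:ID PREDICATE pwn31:ID followed by a space).

```shell
# targets PREDICATE FILE: print the sorted, unique synset IDs linked via PREDICATE.
# Assumes lines like "ili:i1 owl:sameAs pwn31:100001740-n ." (hypothetical sample shape).
targets() {
    sed -n "s/ili:[^ ]* $1 pwn31:[1-5]\([^ ]*\) .*/\1/p" "$2" | sort -u
}

# close_only FILE: synsets with a skos:closeMatch link but no owl:sameAs link.
close_only() {
    tmpdir=$(mktemp -d)
    targets skos:closeMatch "$1" > "$tmpdir/close"
    targets owl:sameAs "$1" > "$tmpdir/same"
    # comm -23 keeps lines found only in the first (closeMatch) list.
    comm -23 "$tmpdir/close" "$tmpdir/same"
    rm -rf "$tmpdir"
}
```

De-duplicating with sort -u before comm means a synset reached by two closeMatch ILIs is listed once rather than slipping through as a second instance.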