Closed scramblingbalam closed 2 years ago
@scramblingbalam, the information_content() function in wordnet.py does not yet handle adjective satellites. So this is an nltk/nltk issue, because nothing needs to change in nltk_data.
Once the satellite problem is fixed, your example could work with IC scores that you calculate yourself with the WordNet library. It still could not work with the wordnet_ic corpora, because these define information content only for nouns and verbs, not adjectives:
def ic(self, icfile):
    """
    Load an information content file from the wordnet_ic corpus
    and return a dictionary. This dictionary has just two keys,
    NOUN and VERB, whose values are dictionaries that map from
    synsets to information content values.
    :type icfile: str
    :param icfile: The name of the wordnet_ic file (e.g. "ic-brown.dat")
    :return: An information content dictionary
    """
I'm trying to compute the Resnik similarity of every word in a sentence to create a measure of sentence similarity based on crowdsourcing narrative intelligence, but the Resnik similarity fails on adjectives.
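A minimal sketch of the pairwise aggregation described here, assuming the similarity function is passed in as a parameter so that a failure on one pair does not abort the whole measure. The names `best_pairwise_score` and `toy_similarity` are hypothetical and not part of NLTK; the toy function only stands in for a real synset similarity.

```python
import itertools


def best_pairwise_score(items_a, items_b, similarity, skip=(Exception,)):
    """Return the highest similarity over all pairs, or None if no pair can be scored.

    `similarity` is any callable taking two items. Pairs for which it raises
    one of the exception types in `skip` are ignored instead of aborting.
    """
    best = None
    for a, b in itertools.product(items_a, items_b):
        try:
            score = similarity(a, b)
        except skip:
            # e.g. an error raised for an unsupported part of speech
            continue
        if best is None or score > best:
            best = score
    return best


def toy_similarity(a, b):
    """Toy stand-in: only strings sharing a first letter can be scored."""
    if a[0] != b[0]:
        raise ValueError("items are not comparable")
    return min(len(a), len(b))


if __name__ == "__main__":
    print(best_pairwise_score(["daily", "dog"], ["day", "cat"], toy_similarity))
```

With NLTK, the same helper could presumably be called with a function such as `lambda a, b: a.res_similarity(b, genesis_ic)` as the `similarity` argument; a result of None would then mean that no synset pair could be scored.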
`import itertools
import nltk
from nltk.corpus import wordnet as wn
nltk.download()
from nltk.corpus import wordnet_ic
brown_ic = wordnet_ic.ic('ic-brown.dat')
semcor_ic = wordnet_ic.ic('ic-semcor.dat')
from nltk.corpus import genesis
genesis_ic = wn.ic(genesis, False, 0.0)
print(wn.synsets("daily"))
synsetsA = wn.synsets("daily", pos=wn.ADJ)
synsetsB = wn.synsets("daily", pos=wn.ADJ)
print(synsetsA)
print(max(list(i[0].res_similarity(i[1], genesis_ic) for i in itertools.product(synsetsA, synsetsB))))`
It throws a key error from WordNet:
`WordNetError: Information content file has no entries for part-of-speech: s`
The full traceback is
The part of speech denoted by 's' is not documented on the WordNet How To page, nor is the inability to compute similarities for some adjectives or adjective satellites. Since the problem seems to be with adjective satellites, the error may be related to #2442. I'm not great at reading module code, but it looks like it should be handled at wordnet lines 2101 to 2104:
for ss in possible_synsets:
    pos = ss._pos
    if pos == ADJ_SAT:
        pos = ADJ
based on the variable assignment at line 68:
ADJ, ADJ_SAT, ADV, NOUN, VERB = "a", "s", "r", "n", "v"
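The normalization quoted above could be applied before an information content lookup roughly as follows. This is only a sketch under stated assumptions, not NLTK's actual implementation: `normalize_pos` and `lookup_ic_table` are hypothetical helpers, the IC dictionary is assumed to be keyed by part-of-speech tag, and the numbers in the toy dictionaries are made up.

```python
# POS tags, as in the assignment quoted from line 68
ADJ, ADJ_SAT, ADV, NOUN, VERB = "a", "s", "r", "n", "v"


def normalize_pos(pos):
    """Map adjective satellites ('s') onto plain adjectives ('a')."""
    return ADJ if pos == ADJ_SAT else pos


def lookup_ic_table(ic, pos):
    """Return the per-POS table of an IC dictionary after normalizing `pos`.

    Hypothetical helper: `ic` is assumed to map POS tags to tables of
    information content counts.
    """
    key = normalize_pos(pos)
    if key not in ic:
        raise KeyError(f"no information content entries for part-of-speech: {key}")
    return ic[key]


# Toy dictionaries with made-up values, for illustration only.
custom_ic = {NOUN: {1: 10.0}, VERB: {2: 4.0}, ADJ: {3: 2.0}}
nouns_verbs_only = {NOUN: {1: 10.0}, VERB: {2: 4.0}}

if __name__ == "__main__":
    # A satellite lookup succeeds when the dictionary has adjective entries.
    print(lookup_ic_table(custom_ic, ADJ_SAT))
    # It still fails for a dictionary that covers only nouns and verbs.
    try:
        lookup_ic_table(nouns_verbs_only, ADJ_SAT)
    except KeyError as err:
        print("lookup failed:", err)
```

The second lookup illustrates why normalizing satellites is not enough on its own: a dictionary holding only noun and verb entries has nothing to return for adjectives either way.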