alvations / pywsd

Python Implementations of Word Sense Disambiguation (WSD) Technologies.
MIT License

IndexError: list index out of range #22

Closed (hwsamuel closed 9 years ago)

hwsamuel commented 9 years ago

I get the following error, possibly because no synset exists.

  File "lesk.py", line 67, in compare_overlaps
    return ranked_synsets[0]
IndexError: list index out of range

alvations commented 9 years ago

@hwsamuel, did you get this from the all-words disambiguation? Which function did you call?

Meanwhile, I'll check through all functions that use compare_overlaps() and fix it as soon as possible.

hwsamuel commented 9 years ago

I used the simple_lesk function. There are no errors with the disambiguate function.

alvations commented 9 years ago

@hwsamuel, could you tell me which word is causing the error?

I'm wondering whether:

  1. there is no synset for the word? or
  2. there is a synset but the ranking of the synset somehow turns into an empty list?

I have been checking, and for my corpora it seems to be the first scenario.
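To tell the two scenarios apart, something like the sketch below could help. It is only an illustration: `lookup` and `rank` are hypothetical stand-ins for the synset lookup and the overlap ranking, not pywsd's actual API, and the toy sense inventory is invented.

```python
# Hypothetical stand-ins: `lookup` maps a word to its candidate senses and
# `rank` orders them. Neither is pywsd's actual API.
def diagnose(word, lookup, rank):
    """Report which of the two scenarios applies to `word`."""
    candidates = lookup(word)
    if not candidates:
        return "no synsets"        # scenario 1: the word has no senses at all
    if not rank(candidates):
        return "empty ranking"     # scenario 2: senses exist, ranking is empty
    return "ok"


# Toy sense inventory, invented for illustration only.
toy_senses = {"bank": ["bank.n.01", "bank.n.02"]}


def toy_lookup(word):
    """Return the toy senses for `word`, or an empty list if unknown."""
    return toy_senses.get(word, [])


print(diagnose("bing", toy_lookup, sorted))          # word missing from the inventory
print(diagnose("bank", toy_lookup, lambda s: []))    # ranking step drops every sense
print(diagnose("bank", toy_lookup, sorted))          # both steps return something
```

Running the same kind of check on the word that triggers the IndexError would show which step comes back empty.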

alvations commented 9 years ago

I have added a check to all Lesk-like functions to return None if the word isn't in WordNet.
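As a rough sketch of what such a guard looks like (this is not the actual change in pywsd, and the helper name `top_ranked_or_none` is made up for illustration), the idea is to avoid indexing into an empty list:

```python
def top_ranked_or_none(ranked_synsets):
    """Return the first item of a ranked list, or None if the list is empty."""
    if not ranked_synsets:
        # Indexing an empty list with [0] is what raises the IndexError.
        return None
    return ranked_synsets[0]


print(top_ranked_or_none([]))                          # empty ranking, no exception
print(top_ranked_or_none(["bank.n.01", "bank.n.02"]))  # first item of the ranking
```

Callers then have to handle a None result instead of catching an exception.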

Also, users can check whether a synset exists for a word before disambiguating it, using the has_synset() function in pywsd.utils, e.g.

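from pywsd.utils import has_synset
from pywsd.lesk import simple_lesk

word = 'bing'
context = 'I was using bing to search for WSD software'

# Only disambiguate when WordNet has at least one synset for the word.
if has_synset(word):
    # simple_lesk takes the context sentence first, then the ambiguous word.
    disambiguated = simple_lesk(context, word)
else:
    disambiguated = None

alvations commented 9 years ago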

@hwsamuel does it work for your dataset now?

germanferrero commented 8 years ago

Still broken, same error. Downloaded today.

alvations commented 8 years ago

@germanferrero can you post the code and the text you're trying to disambiguate? Otherwise it's hard to know what error you're getting.