alvations / pywsd

Python Implementations of Word Sense Disambiguation (WSD) Technologies.
MIT License
741 stars, 132 forks

Fixed similarity_by_path returning None value. #46

Closed. cyrilou242 closed this 5 years ago

cyrilou242 commented 5 years ago

This pull request aims to fix issue https://github.com/alvations/pywsd/issues/31 described by @michael-aloys (and, more marginally, https://github.com/alvations/pywsd/issues/41).

This fix adds checks for None values. When a similarity comes back as None, the function now returns no_path_value, which defaults to 0. The 0 default follows the behavior that was already implemented for the Leacock-Chodorow similarity case when the similarity was not computable.

To discuss:

I made no_path_value an argument instead of hardcoding 0 because in some cases one may want this value not to be 0. For instance, it can be set to a very small float to avoid zeroing out a multiplication or dividing by 0, or to None so the caller can catch it afterwards. The default value of 0 for no_path_value is also open to discussion.
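To illustrate the point about a configurable fallback, here is a minimal sketch. It is not the code in this pull request: `similarity_or_default` and the stub `fake_sim` are hypothetical names, and the stub simply stands in for a WordNet similarity call that finds no path.

```python
def similarity_or_default(sim, sense1, sense2, no_path_value=0):
    """Return sim(sense1, sense2), or no_path_value when the result is None."""
    score = sim(sense1, sense2)
    return no_path_value if score is None else score


# Hypothetical stub: pretends no path exists between the two senses.
def fake_sim(sense1, sense2):
    return None


# Default fallback of 0.
default_score = similarity_or_default(fake_sim, "a", "b")

# A very small float keeps later divisions or products from hitting exactly 0.
tiny_score = similarity_or_default(fake_sim, "a", "b", no_path_value=1e-9)

# None lets the caller detect the missing path explicitly.
none_score = similarity_or_default(fake_sim, "a", "b", no_path_value=None)
```

Which fallback is appropriate depends on how the score is consumed downstream, which is why it is left as a parameter in this sketch.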

It is important to note the behavior I implemented for the path_similarity case: sometimes wn.path_similarity(sense1,sense2) returns a value while wn.path_similarity(sense2,sense1) returns None. When one call returns a number and the other returns None, the number is kept and returned.
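The asymmetric case can be sketched as follows. This is an illustration under stated assumptions rather than the merged implementation: `symmetric_path_similarity` and `asymmetric_sim` are hypothetical names, the similarity function is passed in as a callable so the example runs without WordNet data, and taking the larger score when both directions return a number is an assumption.

```python
def symmetric_path_similarity(sim, sense1, sense2, no_path_value=0):
    """Query the similarity in both directions and ignore None results."""
    forward = sim(sense1, sense2)
    backward = sim(sense2, sense1)
    scores = [s for s in (forward, backward) if s is not None]
    if not scores:
        # Neither direction found a path.
        return no_path_value
    # Assumption: when both directions give a number, keep the larger one.
    return max(scores)


# Hypothetical stub: a path is found in one direction only.
def asymmetric_sim(sense1, sense2):
    return 0.25 if (sense1, sense2) == ("x", "y") else None


one_way = symmetric_path_similarity(asymmetric_sim, "x", "y")
other_way = symmetric_path_similarity(asymmetric_sim, "y", "x")
no_path = symmetric_path_similarity(asymmetric_sim, "p", "q")
```

With this stub, both argument orders yield the same number, and a pair with no path in either direction falls back to `no_path_value`.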

alvations commented 5 years ago

Resolved now that we have more control over how WordNet is accessed through https://github.com/alvations/wordnet