ProjetPP / PPP-QuestionParsing-Grammatical

Question Parsing module for the PPP using a grammatical approach
GNU Affero General Public License v3.0

Nounification #33

Closed yhamoudi closed 9 years ago

yhamoudi commented 9 years ago

What did John Hinckley do to impress Jodie Foster?

Impress >> printer :o

Ezibenroc commented 9 years ago

Here is the list of words given by NLTK, with their probabilities:

[('printer', 0.25),
 ('imprint', 0.08333333333333333),
 ('impressment', 0.08333333333333333),
 ('impression', 0.08333333333333333),
 ('printing', 0.08333333333333333),
 ('shanghaier', 0.08333333333333333),
 ('instilling', 0.08333333333333333),
 ('impress', 0.08333333333333333),
 ('affect', 0.08333333333333333),
 ('print', 0.08333333333333333)]

This is sad, because the verb impress means "affect strongly" whereas the noun impress is more closely related to printing (see Wiktionary). We could hope for a better nounification for this one (without a hardcoded exception).
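
The scores above are consistent with plain relative frequencies over 12 candidate occurrences: 3 for `printer` and 1 for each of the other nine words (3/12 = 0.25, 1/12 ≈ 0.0833). Here is a minimal sketch of that normalization, assuming this is how the scores were obtained (the thread does not confirm it). The `candidates` list is reconstructed from the scores, not produced by NLTK, and `relative_frequencies` is a hypothetical helper name.

```python
from collections import Counter

# Reconstructed candidate occurrences (an assumption, not NLTK output):
# 'printer' three times and every other word once reproduces the listed scores.
candidates = ["printer"] * 3 + [
    "imprint", "impressment", "impression", "printing", "shanghaier",
    "instilling", "impress", "affect", "print",
]

def relative_frequencies(words):
    """Return (word, probability) pairs, most frequent word first."""
    counts = Counter(words)
    total = sum(counts.values())
    return [(word, count / total) for word, count in counts.most_common()]

scores = relative_frequencies(candidates)
best_word, best_score = scores[0]
```

Under this reading, `printer` wins only because it is reached three times, not because it is semantically closest to the verb.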

yhamoudi commented 9 years ago

If we hardcode non-general words (anything other than question words, be, have, ...), we should put the map outside the file that contains the algorithms.

Ezibenroc commented 9 years ago

Yes. And do the same for dependenciesMap.

yhamoudi commented 9 years ago

I'm not sure it's the same thing. dependenciesMap is a core part of the algorithm. Its size will not increase (contrary to nounifyException, which already contains unusual words). Moreover, dependenciesMap is "special" because certain words are mapped to functions; I'm not sure it's easy to move the map out of the file that contains these functions.
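
A rough sketch of the distinction, under stated assumptions: the exception map holds only strings, so it can live in a separate data file, while a map whose values are functions is easier to keep next to those functions. Everything below is hypothetical: the JSON content, the `impress` entry, the dependency keys, and the helpers `load_exceptions`, `merge_handler` and `nounify` are illustrative and not taken from the project.

```python
import json

# Hypothetical contents of a standalone data file for the exceptions.
# The single entry is illustrative only.
EXCEPTIONS_JSON = '{"impress": "impression"}'

def load_exceptions(text):
    """Parse the exception map; it holds plain strings, so JSON is enough."""
    return json.loads(text)

def merge_handler(node):
    """Hypothetical handler standing in for a function used by the algorithm."""
    return "merged:" + node

# Values here mix strings and functions, which plain JSON cannot express,
# so this map stays in the module that defines the functions.
dependencies_map = {"example_dep": merge_handler, "other_dep": "remove"}

def nounify(verb, exceptions, fallback):
    """Return the hardcoded exception if there is one, else use the fallback."""
    if verb in exceptions:
        return exceptions[verb]
    return fallback(verb)

exceptions = load_exceptions(EXCEPTIONS_JSON)
```

With this split, adding an unusual word only touches the data file, and the algorithm module changes only when the algorithm itself does.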

Ezibenroc commented 9 years ago

Yes I agree.