ProjetPP / PPP-QuestionParsing-Grammatical

Question Parsing module for the PPP using a grammatical approach
GNU Affero General Public License v3.0

“carpool from Lyon to Paris on December 31” #72

Open progval opened 9 years ago

progval commented 9 years ago

http://askplatyp.us/?lang=en&q=carpool+from+Lyon+to+Paris+on+December+31

I'm not sure if there is a way to encode this in the datamodel, though.

Ezibenroc commented 9 years ago

(Lyon,carpool,?)∩(Paris,carpool,?)∩(December 31,carpool,?)

I think it is one of the best possible forms with our current datamodel: the intersection of the carpools related to Lyon, Paris and December 31. If we want more precision, we would have to modify the datamodel.

marc-chevalier commented 9 years ago

I think we can do slightly better. If we take '?' to be the trip we are searching for, we want '?' to be a carpool from Paris to Lyon on December 31st:

(?, instance of, carpool) ∩ (?, from, Paris) ∩ (?, to, Lyon) ∩ (?, day, December 31st)

We could replace 'from' with 'departure' or anything better. Likewise: to -> arrival; day -> ... I don't know.

With this expression, '?' is totally determined. We know that we are looking for a carpool from Paris to Lyon on December 31st. I think we have all the information. Moreover, for a prospective carpool module, it's easy to detect whether it has something to do: it just has to search for |(?, instance of, carpool)|.
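To make the detection step concrete, here is a minimal Python sketch of how a carpool module might recognize such a request. The `Triple` and `Intersection` classes and both helper functions are hypothetical names invented for this illustration; they are not the actual PPP datamodel API.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Triple:
    """A (subject, predicate, object) triple; None stands for the '?' hole."""
    subject: Optional[str]
    predicate: str
    object: Optional[str]


@dataclass(frozen=True)
class Intersection:
    """Intersection of the answers to several triples."""
    operands: Tuple[Triple, ...]


def is_carpool_request(tree: Intersection) -> bool:
    # The module only has to look for (?, instance of, carpool).
    return Triple(None, "instance of", "carpool") in tree.operands


def constraints(tree: Intersection) -> Dict[str, Optional[str]]:
    # Collect the remaining constraints on '?' as predicate -> value.
    return {
        t.predicate: t.object
        for t in tree.operands
        if t.subject is None and t.predicate != "instance of"
    }


# The expression from the comment above, written with these toy classes.
request = Intersection((
    Triple(None, "instance of", "carpool"),
    Triple(None, "from", "Paris"),
    Triple(None, "to", "Lyon"),
    Triple(None, "day", "December 31st"),
))
```

With this shape, a module can ignore any tree that lacks the `instance of` triple and read its search parameters from the others.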

The datamodel is powerful! It's not dead! Glory!

On 27/12/2014 at 15:32, Tom Cornebize wrote:

|(Lyon,carpool,?)∩(Paris,carpool,?)∩(December 31,carpool,?)|

I think it is one of the best possible forms with our current datamodel: the intersection of the carpools related to Lyon, Paris and December 31. If we want more precision, we would have to modify the datamodel.


Marc Chevalier

ENS de Lyon Site Monod M1 Informatique Fondamentale

progval commented 9 years ago

What is this pipe notation?

marc-chevalier commented 9 years ago

I don't know where these pipes come from. I don't know if I have to thank Thunderbird or Markdown in Github.

marc-chevalier commented 9 years ago

Just ignore them, they are useless.

Tpt commented 9 years ago

I was going to write the exact same answer as @s-i-newton, so a strong +1 to his answer.

Ezibenroc commented 9 years ago

(?, instance of, carpool) ∩ (?, from, Paris) ∩ (?, to, Lyon) ∩ (?,day, December 31st)

Yes, good idea!

Here is the tree given by the Stanford library: [figure: tmp]

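It looks good. We could slightly modify the way we handle prep_* edges:

  • If it is a prep_and, a prep_or or a prep_of then keep the current rule.
  • Else, for an edge A--prep_X-->B, produce (?, instance of, A) ∩ (?, X, B).

We would have (?, on, December 31st) instead of (?, day, December 31st). I don't think it is a big deal for the carpool module.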

marc-chevalier commented 9 years ago

If the carpool module's programmer knows that he has to handle "on" for the day of the trip, I think it is not a problem.

progval commented 9 years ago

“on” is ambiguous and highly language-dependent (what would you do if you had to add French? You can't translate it literally). A “date” or “time” predicate should be preferred.

On 27/12/2014 17:49, Tom Cornebize wrote:

(?, instance of, carpool) ∩ (?, from, Paris) ∩ (?, to, Lyon) ∩ (?,day, December 31st)

Yes, good idea!

Here is the tree given by the Stanford library: [figure: tmp]

It looks good. We could slightly modify the way we handle prep_* edges:

  • If it is a prep_and, a prep_or or a prep_of then keep the current rule.
  • Else, for an edge A--prep_X-->B, produce (?, instance of, A) ∩ (?, X, B).

We would have (?,on, December 31st) instead of (?,day, December 31st). I don't think it is a big deal for the carpool module.
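The proposed rule for prep_* edges can be sketched in Python as follows. This is only an illustration under assumptions: the function names are hypothetical, triples are plain tuples with "?" as the hole, and returning `None` stands in for "keep the current rule", whose details are not described in this thread.

```python
from typing import List, Optional, Tuple

TripleT = Tuple[str, str, str]

# Edge labels for which the existing behaviour is kept unchanged.
KEEP_CURRENT_RULE = {"prep_and", "prep_or", "prep_of"}


def handle_prep_edge(head: str, label: str, dependent: str) -> Optional[List[TripleT]]:
    """Apply the proposed rule to one edge head --label--> dependent.

    Returns None when the current rule should be kept, otherwise the
    triples (?, instance of, head) and (?, X, dependent).
    """
    if not label.startswith("prep_"):
        raise ValueError(f"not a prep_* edge: {label}")
    if label in KEEP_CURRENT_RULE:
        return None
    predicate = label[len("prep_"):]  # "prep_on" -> "on"
    return [("?", "instance of", head), ("?", predicate, dependent)]


def handle_prep_edges(head: str, edges: List[Tuple[str, str]]) -> List[TripleT]:
    """Combine the triples of all prep_* edges leaving `head`, without duplicates."""
    triples: List[TripleT] = []
    for label, dependent in edges:
        for triple in handle_prep_edge(head, label, dependent) or []:
            if triple not in triples:
                triples.append(triple)
    return triples


# Edges one might expect for "carpool from Lyon to Paris on December 31".
carpool_triples = handle_prep_edges("carpool", [
    ("prep_from", "Lyon"),
    ("prep_to", "Paris"),
    ("prep_on", "December 31"),
])
```

The resulting list is meant to be read as an intersection of its triples, matching the form discussed above.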



marc-chevalier commented 9 years ago

Yes, it is. But the context (instance of carpool) determines the meaning. Thus, the module can interpret the predicate according to its own purpose.

Ezibenroc commented 9 years ago

A “date” or “time” predicate should be preferred.

It depends on the context (e.g. "Who walked on the Moon?"). We could use the NER tag (the stuff between the brackets) to choose the predicate for a prep_on edge (date, location, etc.). "Moon" is not recognized as a location, but that could be fixed by training the library. We could also do the same for "in" prepositions.

In short: what you are asking is possible.
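A minimal sketch of that idea in Python, assuming the NER tags look like `DATE`, `TIME` and `LOCATION`, and that an untagged word gets `O`. The table contents and the `choose_predicate` name are assumptions made for this example, not existing code of the module.

```python
# Hypothetical mapping: preposition -> NER tag of the dependent -> predicate.
PREDICATE_BY_NER_TAG = {
    "on": {"DATE": "date", "TIME": "time", "LOCATION": "location"},
    "in": {"DATE": "date", "LOCATION": "location"},
}


def choose_predicate(preposition: str, ner_tag: str) -> str:
    """Pick a predicate for a prep_<preposition> edge from the NER tag.

    Falls back to the preposition itself when no mapping is known,
    which is what happens for "Moon" as long as it is not tagged
    as a location.
    """
    return PREDICATE_BY_NER_TAG.get(preposition, {}).get(ner_tag, preposition)
```

For example, `choose_predicate("on", "DATE")` yields `"date"`, while `choose_predicate("on", "O")` keeps `"on"`.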

“on” is ambiguous and highly language-dependent

"from" and "to" are also language-dependent.

Moreover, we agreed that the translation work had to be done by the backend modules (this is what Wikidata does). For instance, we could imagine that the carpool module would have the aliases "de" (French) and "von" (German) for the predicate "from".
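Such an alias table could look like the following Python sketch. Only "de" and "von" come from the comment above; they are attached here to the departure predicate, named `from` in this sketch. The "à" and "nach" entries and the `canonical_predicate` helper are illustrative assumptions.

```python
from typing import Optional

# Hypothetical per-module alias table: canonical predicate -> accepted words.
PREDICATE_ALIASES = {
    "from": {"from", "de", "von"},
    "to": {"to", "à", "nach"},  # assumed aliases, for illustration only
}


def canonical_predicate(word: str) -> Optional[str]:
    """Map a predicate word in any supported language to its canonical name.

    Returns None when the module does not know the word.
    """
    folded = word.casefold()
    for canonical, aliases in PREDICATE_ALIASES.items():
        if folded in aliases:
            return canonical
    return None
```

The parser would then stay language-agnostic, and each backend module would decide which words it accepts for its own predicates.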