ProjetPP / Documentation

Documentation and protocol specification of the Projet Pensées Profondes
Creative Commons Zero v1.0 Universal

Reverse predicates #52

Closed Ezibenroc closed 9 years ago

Ezibenroc commented 9 years ago

Following the discussion we had here: https://github.com/ProjetPP/PPP-QuestionParsing-Grammatical/pull/106

There is an issue about the predicates, well illustrated by the following example:

All these triples are correct. But some of them might fail very often on some databases (e.g. Wikidata).

An arbitrary choice made by the question parsing modules would be a poor solution; we need to modify the datamodel.


Proposition 1

A triple with a hole is written [X,N1,N2]. It returns the set of values V such that (X,N1,V) is a correct full triple or (V,N2,X) is a correct full triple.

Examples:

Where does the animal live? -> [animal,residence,inhabitant]

Who lives in the farm? -> [farm,inhabitant,residence]
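To make the semantics of proposition 1 concrete, here is a minimal sketch over a toy in-memory set of full triples. The helper name `query_hole` and the stored fact are assumptions made for illustration only; they are not part of the PPP datamodel. The database deliberately stores the fact in one direction only, which is the situation this issue is about.

```python
# Toy knowledge base of full triples (subject, predicate, object).
# Hypothetical data: the fact is stored in the "inhabitant" direction only.
FACTS = {
    ("farm", "inhabitant", "animal"),
}


def query_hole(facts, x, n1, n2):
    """Return every V such that (x, n1, V) or (V, n2, x) is in facts."""
    forward = {o for (s, p, o) in facts if s == x and p == n1}
    backward = {s for (s, p, o) in facts if o == x and p == n2}
    return forward | backward


# Where does the animal live? -> [animal,residence,inhabitant]
where_animal_lives = query_hole(FACTS, "animal", "residence", "inhabitant")

# Who lives in the farm? -> [farm,inhabitant,residence]
who_lives_in_farm = query_hole(FACTS, "farm", "inhabitant", "residence")

print(sorted(where_animal_lives))
print(sorted(who_lives_in_farm))
```

Both questions are answered even though no (animal, residence, farm) triple exists, because the second predicate is tried in the reverse direction.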

Proposition 2

Use (X, N1, ?) ∪ (?, N2, X) (do not change the datamodel).

Proposition 3

[X, B1, B2] = { c | ∃ a ∈ X ∃ b ∈ B1 (a, b, c) } ∪ { a | ∃ c ∈ X ∃ b ∈ B2 (a, b, c) }
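The set-valued definition of proposition 3 can be transcribed almost literally with set comprehensions. This is only a sketch: the function name `query_sets` and the sample facts are hypothetical, and X, B1 and B2 are assumed to be plain Python sets of strings.

```python
def query_sets(facts, xs, b1, b2):
    """Evaluate [X, B1, B2] over a set of (a, b, c) triples.

    forward  = { c | exists a in X, b in B1 with (a, b, c) in facts }
    backward = { a | exists c in X, b in B2 with (a, b, c) in facts }
    """
    forward = {c for (a, b, c) in facts if a in xs and b in b1}
    backward = {a for (a, b, c) in facts if c in xs and b in b2}
    return forward | backward


# Hypothetical facts, each stored in a single direction.
facts = {
    ("animal", "live in", "farm"),
    ("barn", "inhabitant", "animal"),
}

answers = query_sets(facts, {"animal"}, {"live in", "residence"}, {"inhabitant"})
print(sorted(answers))
```

With singleton sets for X, B1 and B2 this reduces to proposition 1, which is why the two can be seen as the same proposal.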

Proposition 4

(X, B1 ∪ reverse(B2), ?) such that (X, B1 ∪ reverse(B2), ?) = (?, B2 ∪ reverse(B1), X) (we simply add a field in the Triple implementation)

yhamoudi commented 9 years ago

(X, B1 ∪ reverse(B2), ?)

Just a clarification: in practice it's not a union but an optional parameter reverse_predicate = B2 (with predicate = B1)
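As a rough sketch of what proposition 4 could look like with an optional reverse predicate field, consider the following. The `Triple` class and `resolve` function here are illustrative assumptions, not the actual PPP datamodel classes; `None` stands for the hole `?`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Triple:
    """Hypothetical triple with a hole; exactly one of subject/obj is None."""
    subject: Optional[str]
    predicate: str
    obj: Optional[str]
    reverse_predicate: Optional[str] = None  # the added optional field


def resolve(facts, t):
    """Fill the hole of t using predicate, then reverse_predicate if given."""
    if (t.subject is None) == (t.obj is None):
        raise ValueError("exactly one of subject and obj must be the hole")
    if t.obj is None:
        # (X, pred, ?) plus, optionally, (?, reverse_pred, X)
        x = t.subject
        result = {o for (s, p, o) in facts if s == x and p == t.predicate}
        if t.reverse_predicate is not None:
            result |= {s for (s, p, o) in facts
                       if o == x and p == t.reverse_predicate}
    else:
        # (?, pred, X) plus, optionally, (X, reverse_pred, ?)
        x = t.obj
        result = {s for (s, p, o) in facts if o == x and p == t.predicate}
        if t.reverse_predicate is not None:
            result |= {o for (s, p, o) in facts
                       if s == x and p == t.reverse_predicate}
    return result


facts = {("animal", "live in", "farm"), ("barn", "inhabitant", "animal")}

# The two spellings (X, B1, rev=B2, ?) and (?, B2, rev=B1, X) are equivalent.
left = resolve(facts, Triple("animal", "live in", None, "inhabitant"))
right = resolve(facts, Triple(None, "inhabitant", "animal", "live in"))
print(sorted(left), sorted(right))
```

Leaving `reverse_predicate` unset gives back the behaviour of an ordinary triple with a hole, so existing triples would keep their meaning under this assumption.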

yhamoudi commented 9 years ago

The question parsing has been updated and is ready to use reverse predicates, so it would be good to reach an agreement quickly :)

I think we can exclude proposition 2. Moreover, propositions 1 and 3 seem to be the same (but the way proposition 1 is formulated is probably easier to understand), so you can merge them.

The debate starts here: https://github.com/ProjetPP/PPP-QuestionParsing-Grammatical/pull/106#issuecomment-73435667. Personally I prefer proposition 1 because there is not the redundancy introduced by proposition 4 (you can represent [X, B1, B2] by 2 different but equivalent triples), and predicates/reverse predicates have the same importance in [X, B1, B2]. But proposition 4 is perhaps easier to understand/use. I don't have a strong opinion on the subject...

Ezibenroc commented 9 years ago

there is not the redundancy introduced by proposition 4

I don't think the redundancy is an issue.

and predicates/reverse predicates have the same importance

We can choose proposition 4 and still give the same importance to the predicate and the reverse predicate (for instance by making both of them mandatory in the implementation).

But proposition 4 is perhaps easier to understand/use.

I agree, I prefer proposition 4 for this reason.

yhamoudi commented 9 years ago

I don't think the redundancy is an issue.

It requires handling 2 kinds of structures ((A,B,?) and (?,B,C)) instead of 1 ([X, B1, B2]).

Moreover, it's "unsightly". There are 3 different triples with a hole, whereas only one of them is necessary:

[X, B1, B2] is just a way to rewrite (A,B,?) without the hole (since in practice (A,B,?) = (A,pred=B,reverse_pred=reverse(B),?) = [A,B,reverse(B),?]).

I agree, I prefer proposition 4 for this reason.

I'm not totally convinced that it's easier to understand. With [X, B1, B2], there is only one notation to understand, whereas keeping triples requires understanding the meaning of both (A,pred=B1,reverse_pred=reverse(B1),?) and (?,pred=B2,reverse_pred=reverse(B2),C).

For instance, people need to understand that (?,pred='live in',reverse_pred='inhabitant',farm) = (farm,inhabitant,?) ∪ (?,live in, farm) and that (animal,pred='live in',reverse_pred='inhabitant',?) = (animal,live in,?) ∪ (?,inhabitant, animal).

With [X,B1,B2] there is only one form to learn: [animal,live in, inhabitant] = (animal,live in,?) ∪ (?,inhabitant, animal) and [farm,inhabitant,live in] = (farm,inhabitant,?) ∪ (?,live in, farm).
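The redundancy being discussed can be shown with a small conversion helper: one bracket form [X, B1, B2] corresponds to two equivalent proposition 4 spellings. The function name `bracket_to_triples` and the tuple layout (subject, predicate, reverse_predicate, object) are assumptions chosen for this sketch; `None` marks the hole.

```python
def bracket_to_triples(x, b1, b2):
    """Return both proposition 4 spellings of the bracket form [x, b1, b2].

    Each result is a tuple (subject, predicate, reverse_predicate, object),
    where None stands for the hole '?'.
    """
    hole_as_object = (x, b1, b2, None)    # (X, pred=B1, reverse_pred=B2, ?)
    hole_as_subject = (None, b2, b1, x)   # (?, pred=B2, reverse_pred=B1, X)
    return [hole_as_object, hole_as_subject]


# [animal, live in, inhabitant] has two equivalent triple representations.
for spelling in bracket_to_triples("animal", "live in", "inhabitant"):
    print(spelling)
```

Both tuples denote (animal,live in,?) ∪ (?,inhabitant, animal), so a consumer of proposition 4 triples has to recognise either spelling.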


I think most of the pros and cons have been laid out. If I'm the only one (?) to prefer [X, B1, B2], let's choose proposition 4.

yhamoudi commented 9 years ago

After more consideration, I agree with proposition 4 :ok_hand: I think that the triple representation is more familiar to people, and they could be put off by an unfamiliar notation like [X, B1, B2]...

yhamoudi commented 9 years ago

A draft from @Tpt: https://github.com/ProjetPP/PPP-QuestionParsing-Grammatical/pull/106#issuecomment-73658239