ProjetPP / PPP-QuestionParsing-Grammatical

Question Parsing module for the PPP using a grammatical approach
GNU Affero General Public License v3.0

let's standardize! #16

Closed by yhamoudi 10 years ago

yhamoudi commented 10 years ago

A first way to produce a standardized tree from triples.

treeTranslation.py contains the algorithm. I think some of the code could be factored out, but I get errors whenever I try.

If you want to try it:

If someone is comfortable with filling in requesthandler.py (and the other parts needed to communicate with external modules), please do it :)

progval commented 10 years ago

You shouldn't have to edit ppp_nlp_classical/__init__.py. This file is used to expose things to external software, and exposing only the app is enough.

progval commented 10 years ago

Also, the communication protocol changed a lot yesterday and today, so I suggest we wait a day or two before starting work on the implementation of requesthandler. All I need to implement it is a method that takes a string as input and returns:

Ezibenroc commented 10 years ago

Valentin: we modified __init__.py to run our unit tests.

progval commented 10 years ago

I tried typing four different questions, and all of them get "exit: question word has child (please, report your sentence on http://goo.gl/EkgO5l)" as a reply. Am I using it wrong?

Ezibenroc commented 10 years ago

Which questions? Which script?

progval commented 10 years ago

demo6.py, and the questions I wrote on the pad:

Who am I?
Who are you?
What is my birth date?
What is the birth date of the first president of the United States?

yhamoudi commented 10 years ago

Which branch did you use? All of these questions work on triple_standardize (but it seems there is a bug on the non_question_word_questions branch).

Ezibenroc commented 10 years ago

I think that you did not launch the server with the option for the copula. Use this to launch the server:

CORENLP="stanford-corenlp-full-2014-08-27" CORENLP_OPTIONS="-parse.flags \" -makeCopulaHead\"" python3 -m corenlp

(replacing the value of CORENLP with the appropriate directory if needed)

progval commented 10 years ago

I used triple_standardize.

Ezibenroc commented 10 years ago

With the above command line to launch the server, the four questions are ok for me.

progval commented 10 years ago

It works for "How old are you?", though.

progval commented 10 years ago

oh I see, I forgot the copula stuff

progval commented 10 years ago

That format looks quite good, very nice!

However, I am not sure I understand this output:

progval@ganymede:~/etudes/ens/ppp/nlp-classical(triple_standardize)$ echo "Who are you?" | python3 demo/demo6.py 
{
    "object": {
        "type": "missing"
    },
    "subject": {
        "object": {
            "value": "you",
            "type": "resource"
        },
        "subject": {
            "type": "missing"
        },
        "type": "triple",
        "predicate": {
            "value": "are",
            "type": "resource"
        }
    },
    "type": "triple",
    "predicate": {
        "value": "identity",
        "type": "resource"
    }
}

Should the whole first-level subject be {"type": "resource", "value": "you"}?
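Nested output like this is easier to compare when printed in a compact (subject, predicate, object) form. The helper below is only a reading aid and not part of the module; the name to_compact is hypothetical, and it assumes that nodes only ever have the three types seen in the output above (triple, resource, missing).

```python
import json


def to_compact(node):
    """Render a tree node as a compact (subject, predicate, object) string."""
    kind = node["type"]
    if kind == "missing":
        return "?"
    if kind == "resource":
        return node["value"]
    if kind == "triple":
        parts = (node["subject"], node["predicate"], node["object"])
        return "(" + ", ".join(to_compact(part) for part in parts) + ")"
    raise ValueError("unknown node type: %r" % kind)


# The output of demo6.py for "Who are you?", as quoted above.
tree = json.loads("""
{
    "type": "triple",
    "subject": {
        "type": "triple",
        "subject": {"type": "missing"},
        "predicate": {"type": "resource", "value": "are"},
        "object": {"type": "resource", "value": "you"}
    },
    "predicate": {"type": "resource", "value": "identity"},
    "object": {"type": "missing"}
}
""")

print(to_compact(tree))  # ((?, are, you), identity, ?)
```

Printed this way, the extra level is visible at a glance: the first-level subject is itself the triple (?, are, you) rather than the resource you.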

Ezibenroc commented 10 years ago

Yassine: for the question "Who are you?", the tree before simplification has an "are" node as a child of ROOT. I think that we should add the following as question words: Who is, Who are, Who am, Who were, Who was...

Another solution: do not launch the server with the copula option, and merge the copula dependency.
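The first suggestion (multi-word question words) could be sketched as a longest-match-first lookup. This is a simplified illustration on the raw string, whereas the module works on the dependency tree; the function name split_question_word and the contents of QUESTION_WORDS are assumptions made for the example.

```python
# Illustrative list only: the multi-word entries are the ones proposed above,
# the single-word entries are assumptions for the sake of the example.
QUESTION_WORDS = [
    "Who is", "Who are", "Who am", "Who were", "Who was",
    "Who", "Where", "What",
]


def split_question_word(sentence, question_words=QUESTION_WORDS):
    """Return (question_word, rest), trying the longest question words first."""
    words = sentence.rstrip("?").split()
    # Try candidates with more words first so "Who are" wins over "Who".
    candidates = sorted(question_words, key=lambda qw: len(qw.split()), reverse=True)
    for candidate in candidates:
        size = len(candidate.split())
        head = [word.lower() for word in words[:size]]
        if head == candidate.lower().split():
            return candidate, " ".join(words[size:])
    return None, " ".join(words)


print(split_question_word("Who are you?"))    # ('Who are', 'you')
print(split_question_word("Where are you?"))  # ('Where', 'are you')
```

With "Who are" treated as a single question word, only "you" is left for the rest of the analysis, so no (?, are, you) level would be produced.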

yhamoudi commented 10 years ago

I don't understand the problem (the last 2 comments). Do you want different triples for "Who are you?"

progval commented 10 years ago

Yes, I think this one would make more sense:

{
    "object": {
        "type": "missing"
    },
    "subject": {
        "value": "you",
        "type": "resource"
    },
    "type": "triple",
    "predicate": {
        "value": "identity",
        "type": "resource"
    }
}

yhamoudi commented 10 years ago

No, I think it's relevant. The second level will identify you (Valentin) and the first one will provide a description/identity of you.

In the case of Who questions it can seem a bit too complicated. But if you replace Who with Where, it behaves just as you want: ((?,are,you), location, ?)

The current algorithm just removes the question word, produces some triples, and then adds an extra triple depending on the question word (see https://github.com/ProjetPP/PPP-NLP-classical/issues/14).

But perhaps it could be interesting to add question words such as "Who are" in order to remove a level.
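The last step of the algorithm described above (adding an extra triple depending on the question word) could look roughly like the sketch below. It is not the code from treeTranslation.py: the helper names and the QUESTION_PREDICATES table are hypothetical, and only the two mappings mentioned in this thread (Who to identity, Where to location) are included.

```python
# Hypothetical mapping; only the two pairs mentioned in this thread.
QUESTION_PREDICATES = {"who": "identity", "where": "location"}


def missing():
    return {"type": "missing"}


def resource(value):
    return {"type": "resource", "value": value}


def triple(subject, predicate, obj):
    return {"type": "triple", "subject": subject, "predicate": predicate, "object": obj}


def add_question_triple(question_word, inner):
    """Wrap the tree built from the rest of the sentence in an extra triple."""
    predicate = QUESTION_PREDICATES.get(question_word.lower())
    if predicate is None:
        # Unknown question word: leave the tree unchanged in this sketch.
        return inner
    return triple(inner, resource(predicate), missing())


# Tree produced for "are you" once the question word has been removed.
inner = triple(missing(), resource("are"), resource("you"))

who_tree = add_question_triple("Who", inner)      # ((?, are, you), identity, ?)
where_tree = add_question_triple("Where", inner)  # ((?, are, you), location, ?)
```

Because the inner tree is built without looking at the question word, "Who are you?" and "Where are you?" share the same second level and differ only in the first-level predicate.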

progval commented 10 years ago

Oh, I see

Ezibenroc commented 10 years ago

We should merge if you are OK with it (and create a new branch for the part about the requestHandler).

yhamoudi commented 10 years ago

As you want, I have nothing to add on this branch.