Stanford Dependency returned for Sentence does not match.

dmcc / PyStanfordDependencies

Python interface for converting Penn Treebank trees to Stanford Dependencies and Universal Dependencies

https://pypi.python.org/pypi/PyStanfordDependencies


Stanford Dependency returned for Sentence does not match. #18

Closed bonsonsm closed 8 years ago

bonsonsm commented 8 years ago

Hello,

The sample sentence I used is: "Janet had prune juice today before lunch." When I run it through StanfordCoreNLP in R, I get this parse:

(ROOT (S (NP (NNP Janet)) (VP (VBD had) (S (VP (VB prune) (NP (NN juice)) (NP-TMP (NN today)) (PP (IN before) (NP (NN lunch)))))) (. .)))

Using pyStanfordDependencies, I get:

(S (NP (NNP Janet)) (VP (VBD had) (VP (VBN prune) (NP (NN juice) (NN today)) (PP (IN before) (NP (NN lunch))))) (. .))

This difference makes it difficult to apply rules to get triples from the sentence. Kindly review. Maybe I am making a mistake somewhere.

Regards, Bonson
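
The two trees quoted above can be compared mechanically to see where they diverge. The following is a minimal sketch using only the Python standard library. The helper names (`tagged_words`, `constituent_labels`, `tag_differences`) are hypothetical and not part of PyStanfordDependencies, CoreNLP, or NLTK, and the regular expressions assume well-formed, single-space-separated brackets like those in the trees above.

```python
import re
from collections import Counter

# The two parses quoted in the report above.
CORENLP_TREE = ("(ROOT (S (NP (NNP Janet)) (VP (VBD had) (S (VP (VB prune) "
                "(NP (NN juice)) (NP-TMP (NN today)) (PP (IN before) "
                "(NP (NN lunch)))))) (. .)))")
OTHER_TREE = ("(S (NP (NNP Janet)) (VP (VBD had) (VP (VBN prune) "
              "(NP (NN juice) (NN today)) (PP (IN before) (NP (NN lunch))))) (. .))")

def tagged_words(tree):
    """Return (tag, word) pairs for the leaves, i.e. every "(TAG word)" group."""
    return re.findall(r"\(([^\s()]+) ([^\s()]+)\)", tree)

def constituent_labels(tree):
    """Return the labels of non-leaf nodes: a label followed by another "("."""
    return re.findall(r"\(([^\s()]+) (?=\()", tree)

def tag_differences(tree_a, tree_b):
    """List (word, tag_a, tag_b) for aligned words whose POS tags differ."""
    return [(word_a, tag_a, tag_b)
            for (tag_a, word_a), (tag_b, word_b)
            in zip(tagged_words(tree_a), tagged_words(tree_b))
            if word_a == word_b and tag_a != tag_b]

# POS tags that differ between the two parses.
print(tag_differences(CORENLP_TREE, OTHER_TREE))  # [('prune', 'VB', 'VBN')]

# Constituent labels that occur more often in the first tree than in the second.
extra_labels = Counter(constituent_labels(CORENLP_TREE)) - Counter(constituent_labels(OTHER_TREE))
print(dict(extra_labels))
```

For these two trees the differences are the tag on "prune" (VB versus VBN) and the extra ROOT, S, and NP-TMP nodes in the first tree. That points to different parser models or settings rather than to anything in the dependency conversion step.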

dmcc commented 8 years ago

PyStanfordDependencies isn't a parser (just a way of converting trees to dependencies) so I'm confused how you got the second tree. Can you say more about the code you're running?

bonsonsm commented 8 years ago

Hello,

I understand. It is not an issue with the package. The parser output is different. Kindly close this case.

I am using the following parser:

parser = stanford.StanfordParser(model_path=r".\edu\stanford\nlp\models\lexparser\englishPCFG.ser.gz")

objSentences = parser.raw_parse_sents(strSentenceList)

I tried two other parser models as well:

englishFactored.ser.gz
englishPCFG.caseless.ser.gz

But I am not getting output that matches the first tree. Kindly let me know if you have any insights on how to proceed with this.

Thanks and Regards, Bonson

dmcc commented 8 years ago

Sorry, I'm still not sure which Python package you're using to get your trees. Either way, I'm afraid this isn't the right forum for this issue. Please file it with the package that stanford.StanfordParser comes from (nltk.parse.stanford, maybe? In that case, you should head over to NLTK). Hope this helps.