agentile / PHP-Stanford-NLP

PHP interface to Stanford NLP tools (POS Tagger, NER, Parser)

Parser parseSentence output is not nicely broken out like the sample in the documentation; it is all glommed together into a single array #12

Open chilipepper987 opened 9 years ago

chilipepper987 commented 9 years ago

I am using version 3.4.1 of the parser jar and the parser models jar. When I parse the sentence "What does the fox say?", instead of a result neatly divided into wordsAndTags, penn, and typedDependencies, I get one giant array: the wordsAndTags result appears to contain the output of all three categories, while penn and typedDependencies are empty. In addition, the part of the output that looks like the penn result is not broken down into parent/children, and the part that looks like the typedDependencies result is not broken down by feature/index.
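To illustrate the feature/index breakdown mentioned above: a typed dependency line such as `dobj(say-5, What-1)` can be split into a relation name plus a word and position for each argument with one regular expression. This is only a minimal sketch. The function name `parseDependencyLine` and the array keys (`type`, `governor`, `dependent`) are illustrative assumptions, not necessarily the shape the library's documentation shows.

```php
<?php
/**
 * Hypothetical helper (not part of PHP-Stanford-NLP): break one typed
 * dependency line, e.g. "dobj(say-5, What-1)", into its relation name
 * and a feature/index pair for each of the two arguments.
 * Returns null when the line does not look like a dependency.
 */
function parseDependencyLine(string $line): ?array
{
    // relation(word-N, word-M); the greedy (.+) keeps hyphenated words intact.
    $pattern = '/^([\w:]+)\((.+)-(\d+), (.+)-(\d+)\)$/';
    if (!preg_match($pattern, trim($line), $m)) {
        return null;
    }
    return [
        'type'      => $m[1],
        'governor'  => ['feature' => $m[2], 'index' => (int) $m[3]],
        'dependent' => ['feature' => $m[4], 'index' => (int) $m[5]],
    ];
}

$dep = parseDependencyLine('dobj(say-5, What-1)');
print_r($dep);
```

Lines that do not match the pattern come back as `null`, so a caller can skip blank or unexpected lines instead of storing half-parsed entries.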

Since I am using the same version of the jar files, I can't figure out why I am getting different output. I want the same output as the doc; the output I am getting would be much more difficult to work with. Here is my code, followed by the full output I am getting:

$parserJAR = getcwd() . ".\\models\\parse\\stanford-parser.jar";
$parser = getcwd() . ".\\models\\parse\\stanford-parser-3.4.1-models.jar";
$sentence = "What does the fox say?";
$pos = new \StanfordNLP\Parser($parserJAR, $parser);
$parseResult = $pos->parseSentence($sentence);
echo print_r($parseResult, true);
 Array
(
    [wordsAndTags] => Array
        (
            [0] => Array
                (
                    [0] => What
                    [1] => WP
                )

            [1] => Array
                (
                    [0] => does
                    [1] => VBZ
                )

            [2] => Array
                (
                    [0] => the
                    [1] => DT
                )

            [3] => Array
                (
                    [0] => fox
                    [1] => NN
                )

            [4] => Array
                (
                    [0] => say
                    [1] => VB
                )

            [5] => Array
                (
                    [0] => ?
                    [1] => .

(ROOT

                )

            [6] => Array
                (
                    [0] => 
                    [1] => 
                )

            [7] => Array
                (
                    [0] => 
                    [1] => (SBARQ

                )

            [8] => Array
                (
                    [0] => 
                    [1] => 
                )

            [9] => Array
                (
                    [0] => 
                    [1] => 
                )

            [10] => Array
                (
                    [0] => 
                    [1] => 
                )

            [11] => Array
                (
                    [0] => 
                    [1] => (WHNP
                )

            [12] => Array
                (
                    [0] => 
                    [1] => (WP
                )

            [13] => Array
                (
                    [0] => 
                    [1] => What))

                )

            [14] => Array
                (
                    [0] => 
                    [1] => 
                )

            [15] => Array
                (
                    [0] => 
                    [1] => 
                )

            [16] => Array
                (
                    [0] => 
                    [1] => 
                )

            [17] => Array
                (
                    [0] => 
                    [1] => (SQ
                )

            [18] => Array
                (
                    [0] => 
                    [1] => (VBZ
                )

            [19] => Array
                (
                    [0] => 
                    [1] => does)

                )

            [20] => Array
                (
                    [0] => 
                    [1] => 
                )

            [21] => Array
                (
                    [0] => 
                    [1] => 
                )

            [22] => Array
                (
                    [0] => 
                    [1] => 
                )

            [23] => Array
                (
                    [0] => 
                    [1] => 
                )

            [24] => Array
                (
                    [0] => 
                    [1] => 
                )

            [25] => Array
                (
                    [0] => 
                    [1] => (NP
                )

            [26] => Array
                (
                    [0] => 
                    [1] => (DT
                )

            [27] => Array
                (
                    [0] => 
                    [1] => the)
                )

            [28] => Array
                (
                    [0] => 
                    [1] => (NN
                )

            [29] => Array
                (
                    [0] => 
                    [1] => fox))

                )

            [30] => Array
                (
                    [0] => 
                    [1] => 
                )

            [31] => Array
                (
                    [0] => 
                    [1] => 
                )

            [32] => Array
                (
                    [0] => 
                    [1] => 
                )

            [33] => Array
                (
                    [0] => 
                    [1] => 
                )

            [34] => Array
                (
                    [0] => 
                    [1] => 
                )

            [35] => Array
                (
                    [0] => 
                    [1] => (VP
                )

            [36] => Array
                (
                    [0] => 
                    [1] => (VB
                )

            [37] => Array
                (
                    [0] => 
                    [1] => say)))

                )

            [38] => Array
                (
                    [0] => 
                    [1] => 
                )

            [39] => Array
                (
                    [0] => 
                    [1] => 
                )

            [40] => Array
                (
                    [0] => 
                    [1] => 
                )

            [41] => Array
                (
                    [0] => 
                    [1] => (.
                )

            [42] => Array
                (
                    [0] => 
                    [1] => ?)))

dobj(say-5,
                )

            [43] => Array
                (
                    [0] => 
                    [1] => What-1)
aux(say-5,
                )

            [44] => Array
                (
                    [0] => 
                    [1] => does-2)
det(fox-4,
                )

            [45] => Array
                (
                    [0] => 
                    [1] => the-3)
nsubj(say-5,
                )

            [46] => Array
                (
                    [0] => 
                    [1] => fox-4)
root(ROOT-0,
                )

            [47] => Array
                (
                    [0] => 
                    [1] => say-5)
                )

        )

    [penn] => Array
        (
            [parent] => 
            [children] => Array
                (
                )

        )

    [typedDependencies] => Array
        (
        )

)
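One possible cause, offered as an unconfirmed guess rather than a diagnosis: the backslash paths above suggest a Windows machine, where the Java process may emit `\r\n` line endings. If the wrapper looks for a blank line written as `\n\n` to separate the three sections, it would never find one, and everything would end up in the first section, which matches the dump. The sketch below shows the idea on a raw output string. `splitParserOutput` is a hypothetical helper, not part of PHP-Stanford-NLP; the sample string is reconstructed from the dump above, and its `\r\n` endings are an assumption. The wrapper's source has not been checked.

```php
<?php
/**
 * Hypothetical helper (not part of PHP-Stanford-NLP): split raw parser
 * output into its blank-line-separated sections, after normalizing
 * Windows (\r\n) and bare \r line endings to \n.
 */
function splitParserOutput(string $raw): array
{
    $normalized = str_replace(["\r\n", "\r"], "\n", $raw);
    // Sections are separated by one or more blank lines.
    $sections = preg_split('/\n{2,}/', trim($normalized));
    // Trim only the outer whitespace; indentation inside a section is kept.
    return array_map('trim', $sections);
}

// Sample reconstructed from the dump above; \r\n line endings are assumed.
$raw = "What/WP does/VBZ the/DT fox/NN say/VB ?/.\r\n\r\n"
     . "(ROOT\r\n  (SBARQ\r\n    (WHNP (WP What))\r\n"
     . "    (SQ (VBZ does)\r\n      (NP (DT the) (NN fox))\r\n"
     . "      (VP (VB say)))\r\n    (. ?)))\r\n\r\n"
     . "dobj(say-5, What-1)\r\naux(say-5, does-2)\r\ndet(fox-4, the-3)\r\n"
     . "nsubj(say-5, fox-4)\r\nroot(ROOT-0, say-5)\r\n";

[$wordsAndTags, $penn, $typedDependencies] = splitParserOutput($raw);

echo $wordsAndTags, "\n";
```

If the sections separate cleanly after normalization, that would point at line endings as the culprit; if they do not, the cause lies elsewhere.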
AdamNix commented 7 years ago

Did you ever solve this?

chilipepper987 commented 7 years ago

@AdamNix No! I pretty much gave up on this repo after not hearing back on this issue report. It seems like this project is abandonware? I've been looking at some Node.js solutions that seem promising, but I'd rather be able to do it from PHP.

agentile commented 7 years ago

@chilipepper987 Hi, I do intend to spend some time addressing issues and updating this repo. I know it does get a lot of attention. Sadly, finding the time with a full-time job and children can be difficult. I am not going to give definite time frames, but it is something I want to address soon.