Closed AdamNix closed 7 years ago
Hi Adam,
Do you mean just reading the Penn tree out of the result, like this:
$result = $parser->parseSentence("What does the fox say?");
var_dump($result['penn']);
Or do you want parseSentence itself to produce only the Penn output, with something like $parser->parseSentence("What does the fox say?", ['penn']);?
FWIW, I do plan to revisit this project, address a lot of the issues people have been having, and bring it more up to date.
Thanks for the reply, Anthony. I believe a lot of the issues revolve around trying to keep three areas in sync: Java updates, PHP, and Stanford Parser updates. For example, when I run either of the suggestions above I get:
array(2) { ["parent"]=> NULL ["children"]=> array(0) { } }
I am using PHP 7.1, and Stanford Parser: 'C:\stanford-parser-full-2015-04-20\stanford-parser.jar', 'C:\stanford-parser-full-2015-04-20\stanford-parser-3.5.2-models.jar'
Using the original example, I flattened the array and would be able to get rid of the words and tags by searching for '(ROOT'. The Universal Dependencies (typed dependencies) are a little more difficult.
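The '(ROOT' search can be sketched in plain PHP. This is a minimal sketch, not part of PHP-Stanford-NLP: extractPennBlock is a hypothetical helper, and it assumes the raw parser output is available as one string, that the Penn tree starts on a line beginning with "(ROOT", and that the tree ends where its brackets balance.

```php
<?php
/**
 * Hypothetical helper, not part of PHP-Stanford-NLP.
 * Returns the Penn tree block from raw parser output, or null if none is found.
 */
function extractPennBlock(string $raw): ?string
{
    $block = [];
    $depth = 0;
    $inTree = false;

    foreach (preg_split('/\R/', $raw) as $line) {
        if (!$inTree) {
            if (strpos(ltrim($line), '(ROOT') !== 0) {
                continue; // still before the tree (e.g. the words-and-tags line)
            }
            $inTree = true;
        }
        $block[] = $line;
        // Track bracket balance; the tree is complete when it returns to zero.
        $depth += substr_count($line, '(') - substr_count($line, ')');
        if ($depth <= 0) {
            break;
        }
    }

    return $inTree ? implode("\n", $block) : null;
}

// Sample input in the shape of the parser's combined output.
$raw = "What/WP does/VBZ the/DT fox/NN say/VB ?/.\n"
     . "(ROOT\n  (SBARQ\n    (WHNP (WP What))\n    (SQ (VBZ does)\n"
     . "      (NP (DT the) (NN fox))\n      (VP (VB say)))\n    (. ?)))\n"
     . "dobj(say-5, What-1)\n";

$penn = extractPennBlock($raw);
echo $penn, "\n";
```

Counting brackets rather than looking for a blank line keeps the sketch independent of how the sections happen to be separated.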
@AdamNix Hmm, yeah, I am not sure why you are getting an array like that.
I just updated the repo and retested against Stanford Parser version 3.8.0. Here is what I am running, Java- and PHP-wise:
agentile@agentile:~/php-stanford$ java -version
openjdk version "1.8.0_131"
OpenJDK Runtime Environment (build 1.8.0_131-8u131-b11-2ubuntu1.16.04.2-b11)
OpenJDK 64-Bit Server VM (build 25.131-b11, mixed mode)
agentile@agentile:~/php-stanford$ php -v
PHP 7.0.18-0ubuntu0.16.04.1 (cli) ( NTS )
Copyright (c) 1997-2017 The PHP Group
Zend Engine v3.0.0, Copyright (c) 1998-2017 Zend Technologies
with Zend OPcache v7.0.18-0ubuntu0.16.04.1, Copyright (c) 1999-2017, by Zend Technologies
Here is what I ran: https://github.com/agentile/PHP-Stanford-NLP/blob/master/examples/stanford.php
Notice the commented-out bits: if you call setDebug, it will echo the actual command sent behind the scenes, e.g.
agentile@agentile:~/php-stanford/examples$ php stanford.php
DEBUG: Command used: java -mx300m -cp "/home/agentile/php-stanford/stanford-parser-full-2017-06-09/stanford-parser.jar:/home/agentile/php-stanford/stanford-parser-full-2017-06-09/stanford-parser-3.8.0-models.jar" edu.stanford.nlp.parser.lexparser.LexicalizedParser -encoding UTF-8 -outputFormat "wordsAndTags,penn,typedDependencies" edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz /tmp/phpnlpparserYRVcyk
If you set $parser->setOutputFormat('penn'); then it will only produce that format, and the values for wordsAndTags and typedDependencies will be null.
You should be getting back an associative array.
Could you do a few things for me? Tell me whether you are running on Unix or Windows, and copy and paste more of the code you are running, as it is hard for me to debug on your behalf without more information. Can you also update to Stanford Parser version 3.8.0?
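One reason the platform matters: Java separates classpath entries with ':' on Unix and ';' on Windows, and PHP exposes the matching character as the PATH_SEPARATOR constant. The following is only a sketch of that difference; buildClasspath is a hypothetical helper and is not taken from the library's source.

```php
<?php
/**
 * Hypothetical helper, not taken from PHP-Stanford-NLP.
 * Joins jar paths into a Java classpath using the given separator,
 * defaulting to the one for the platform PHP is running on.
 */
function buildClasspath(array $jars, string $separator = PATH_SEPARATOR): string
{
    return implode($separator, $jars);
}

$jars = ['stanford-parser.jar', 'stanford-parser-3.8.0-models.jar'];

echo buildClasspath($jars, ':'), "\n"; // Unix-style classpath
echo buildClasspath($jars, ';'), "\n"; // Windows-style classpath
echo buildClasspath($jars), "\n";      // whatever the current platform uses
```

Comparing a string built this way with the command printed by setDebug is one way to check whether the separator is correct on a given machine.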
agentile@agentile:~/php-stanford/examples$ java -mx300m -cp "/home/agentile/php-stanford/stanford-parser-full-2017-06-09/stanford-parser.jar:/home/agentile/php-stanford/stanford-parser-full-2017-06-09/stanford-parser-3.8.0-models.jar" edu.stanford.nlp.parser.lexparser.LexicalizedParser -encoding UTF-8 -outputFormat "wordsAndTags,penn,typedDependencies" edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz /tmp/test
Loading parser from serialized file edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz ... done [0.7 sec].
Parsing file: /tmp/test
Parsing [sent. 1 len. 6]: What does the fox say ?
What/WP does/VBZ the/DT fox/NN say/VB ?/.
(ROOT
(SBARQ
(WHNP (WP What))
(SQ (VBZ does)
(NP (DT the) (NN fox))
(VP (VB say)))
(. ?)))
dobj(say-5, What-1)
aux(say-5, does-2)
det(fox-4, the-3)
nsubj(say-5, fox-4)
root(ROOT-0, say-5)
Parsed file: /tmp/test [1 sentences].
Parsed 6 words in 1 sentences (9.63 wds/sec; 1.61 sents/sec).
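The typed dependency lines in this output follow a regular relation(governor-index, dependent-index) pattern, so they can be pulled apart with a regular expression. A minimal sketch in plain PHP, assuming that line format; parseTypedDependencies is a hypothetical helper and not part of PHP-Stanford-NLP.

```php
<?php
/**
 * Hypothetical helper, not part of PHP-Stanford-NLP.
 * Parses lines such as "nsubj(say-5, fox-4)" into structured rows
 * and ignores every line that does not match that pattern.
 */
function parseTypedDependencies(string $raw): array
{
    $deps = [];
    foreach (preg_split('/\R/', $raw) as $line) {
        // relation(governor-index, dependent-index)
        if (preg_match('/^([\w:]+)\((.+)-(\d+), (.+)-(\d+)\)$/', trim($line), $m)) {
            $deps[] = [
                'relation'  => $m[1],
                'governor'  => ['word' => $m[2], 'index' => (int) $m[3]],
                'dependent' => ['word' => $m[4], 'index' => (int) $m[5]],
            ];
        }
    }
    return $deps;
}

// Lines copied from the output above, with some non-dependency lines mixed in.
$raw = "What/WP does/VBZ the/DT fox/NN say/VB ?/.\n"
     . "(ROOT\n"
     . "dobj(say-5, What-1)\n"
     . "aux(say-5, does-2)\n"
     . "det(fox-4, the-3)\n"
     . "nsubj(say-5, fox-4)\n"
     . "root(ROOT-0, say-5)\n"
     . "Parsed file: /tmp/test [1 sentences].\n";

$deps = parseTypedDependencies($raw);
foreach ($deps as $dep) {
    echo $dep['relation'], ': ', $dep['governor']['word'], ' -> ', $dep['dependent']['word'], "\n";
}
```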
I'm using Windows 7 Pro, and I've updated to Stanford Parser version 3.8.0. I have the Stanford Parser and POS Tagger, both zipped and unzipped, in a folder called Stanford. My PHP file is there too. When I run the program below, $result['penn'] comes back as an empty array:
` /**
*/ class Base { /**
/**
*/ class Exception extends \Exception { }
// autoload mimics https://github.com/auraphp
spl_autoload_register(function ($class) {
    // the package namespace
    $ns = 'StanfordNLP';

    // what prefixes should be recognized?
    $prefixes = array(
        "{$ns}\\" => array(
            __DIR__ . '/src/' . $ns,
        ),
    );

    // go through the prefixes
    foreach ($prefixes as $prefix => $dirs) {
        // does the requested class match the namespace prefix?
        $prefix_len = strlen($prefix);
        if (substr($class, 0, $prefix_len) !== $prefix) {
            continue;
        }

        // strip the prefix off the class
        $class = substr($class, $prefix_len);

        // a partial filename
        $part = str_replace('\\', DIRECTORY_SEPARATOR, $class) . '.php';

        // go through the directories to find classes
        foreach ($dirs as $dir) {
            $dir = str_replace('/', DIRECTORY_SEPARATOR, $dir);
            $file = $dir . DIRECTORY_SEPARATOR . $part;
            if (is_readable($file)) {
                require $file;
                return;
            }
        }
    }
});
// Adam: The code below comes back with no result.
// assume composer autoload
require_once dirname(dirname(__FILE__)) . DIRECTORY_SEPARATOR . 'vendor' . DIRECTORY_SEPARATOR . 'autoload.php';

$path = dirname(dirname(__FILE__)) . DIRECTORY_SEPARATOR . 'stanford-parser-full-2017-06-09';
$parser = new \StanfordNLP\Parser(
    $path . DIRECTORY_SEPARATOR . 'stanford-parser.jar',
    $path . DIRECTORY_SEPARATOR . 'stanford-parser-3.8.0-models.jar'
);

//$parser->setDebug(true);
//$parser->setOutputFormat('penn');
//$result = $parser->parseSentence("What does the fox say?");
$result = $parser->parseSentences(["What does the fox say?", "Hi bob, how are you?"]);
var_dump($result);`
// Adam: The code below gets me the words and tags, the Penn tree, and the typed dependencies, but all in one array that cannot be separated easily.
$parser = new \StanfordNLP\PaC:\stanford\stanford-parser-full-2018-06-09\stanford-parser.C:\stanford\stanford-parser-full-2018-06-09\stanford-parser-3.5.2-models.jar' );
var_dump($result);
Adding $parser->setOutputFormat('penn'); seems to have done the trick. Here is the code:
$parser = new \StanfordNLP\Parser(
    'C:\stanford\stanford-parser-full-2017-06-09\stanford-parser.jar',
    'C:\stanford\stanford-parser-full-2017-06-09\stanford-parser-3.8.0-models.jar'
);
//var_dump($result);
$parser->setOutputFormat('penn');
$result = $parser->parseSentence("What does the fox say?");
var_dump($result['penn']);
Here is the output:
StanSmitharray(2) { ["parent"]=> string(4) "ROOT" ["children"]=> array(1) { [0]=> array(2) { ["parent"]=> string(5) "SBARQ" ["children"]=> array(3) { [0]=> array(2) { ["parent"]=> string(4) "WHNP" ["children"]=> array(1) { [0]=> array(2) { ["parent"]=> string(7) "WP What" ["children"]=> array(0) { } } } } [1]=> array(2) { ["parent"]=> string(2) "SQ" ["children"]=> array(3) { [0]=> array(2) { ["parent"]=> string(8) "VBZ does" ["children"]=> array(0) { } } [1]=> array(2) { ["parent"]=> string(2) "NP" ["children"]=> array(2) { [0]=> array(2) { ["parent"]=> string(6) "DT the" ["children"]=> array(0) { } } [1]=> array(2) { ["parent"]=> string(6) "NN fox" ["children"]=> array(0) { } } } } [2]=> array(2) { ["parent"]=> string(2) "VP" ["children"]=> array(1) { [0]=> array(2) { ["parent"]=> string(6) "VB say" ["children"]=> array(0) { } } } } } } [2]=> array(2) { ["parent"]=> string(3) ". ?" ["children"]=> array(0) { } } } } } }
[Finished in 3.5s]
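Since the Penn result comes back as nested parent/children arrays, the words and their tags can be recovered by walking the tree recursively. A minimal sketch, assuming the shape shown in the var_dump above, where leaf labels look like "WP What" and leaves have an empty children array; collectLeaves is a hypothetical helper and not part of PHP-Stanford-NLP.

```php
<?php
/**
 * Hypothetical helper, not part of PHP-Stanford-NLP.
 * Collects [tag, word] pairs from a nested parent/children tree.
 */
function collectLeaves(array $node): array
{
    if (empty($node['children'])) {
        // Leaf labels look like "WP What": tag, one space, then the word.
        $parts = explode(' ', (string) $node['parent'], 2);
        // An empty node (parent NULL, no children) yields no leaves.
        return count($parts) === 2 ? [$parts] : [];
    }

    $leaves = [];
    foreach ($node['children'] as $child) {
        $leaves = array_merge($leaves, collectLeaves($child));
    }
    return $leaves;
}

// Trimmed-down tree in the same shape as the output above.
$tree = [
    'parent' => 'ROOT',
    'children' => [[
        'parent' => 'NP',
        'children' => [
            ['parent' => 'DT the', 'children' => []],
            ['parent' => 'NN fox', 'children' => []],
        ],
    ]],
];

foreach (collectLeaves($tree) as [$tag, $word]) {
    echo "$word/$tag\n";
}
```

Run against the empty array reported earlier in the thread (parent NULL, no children), the same function returns an empty list, which makes the failure case easy to detect.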
Many Thanks, Anthony,
Hi
Is there a way to output only the Penn results?
Wonderful program.