How to easily get VP/NP/PP phrase?

bayesrule commented 2 years ago

Hi,

I'm new to this great tool. Beyond parsing itself, is there a convenient way to extract useful elements (e.g. noun phrases in the input) from the parsing result?

yzhangcs commented 2 years ago

@bayesrule Hi, the results predicted by the parser are stored as nltk.Tree objects, so you can access nonterminals directly via the nltk APIs:

>>> import supar
>>> from supar import Parser
>>> par = Parser.load('crf-con-en')
>>> tree = par.predict("She enjoys playing tennis .".split())[0].trees
>>> tree.productions()
[TOP -> S, S -> NP VP _, NP -> _, _ -> 'She', VP -> _ S, _ -> 'enjoys', S -> VP, VP -> _ NP, _ -> 'playing', NP -> _, _ -> 'tennis', _ -> '.']
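To pull out the phrases themselves rather than the productions, one option is nltk's `Tree.subtrees` with a label filter. The sketch below rebuilds the tree from a bracketed string that matches the productions above, so it runs without loading a model; the helper name `extract_phrases` and the default label set are illustrative assumptions, not part of supar or nltk.

```python
from nltk import Tree

# Bracketed form of the parse shown above (POS tags appear as "_").
# Built by hand here so the example does not need a trained parser.
tree = Tree.fromstring(
    "(TOP (S (NP (_ She)) (VP (_ enjoys) (S (VP (_ playing) (NP (_ tennis))))) (_ .)))"
)


def extract_phrases(tree, labels=("NP", "VP", "PP")):
    """Return (label, phrase text) pairs for every subtree whose label is in `labels`.

    Hypothetical helper: it only relies on nltk.Tree.subtrees, label and leaves.
    """
    wanted = set(labels)
    return [
        (sub.label(), " ".join(sub.leaves()))
        for sub in tree.subtrees(filter=lambda t: t.label() in wanted)
    ]


phrases = extract_phrases(tree)
for label, text in phrases:
    print(label, "->", text)
# NP -> She
# VP -> enjoys playing tennis
# VP -> playing tennis
# NP -> tennis
```

With a real prediction, the same helper should work on the `tree` object obtained from the parser, since it is an nltk.Tree as well.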

Besides, supar also provides utility functions for factorizing a tree into labeled (start, end, label) spans:

>>> supar.utils.Tree.factorize(tree)
[(0, 5, 'TOP'), (0, 5, 'S'), (0, 1, 'NP'), (1, 4, 'VP'), (2, 4, 'S'), (2, 4, 'VP'), (3, 4, 'NP')]
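The spans can be mapped back to text by slicing the input words. The sketch below assumes the spans are half-open word ranges `[start, end)`, which is consistent with `(0, 5, 'TOP')` covering all five tokens; the span list is copied from the output above, and `phrases_by_label` is a hypothetical helper, not a supar function.

```python
from collections import defaultdict

words = "She enjoys playing tennis .".split()

# Output of the factorization shown above, copied verbatim.
spans = [(0, 5, 'TOP'), (0, 5, 'S'), (0, 1, 'NP'), (1, 4, 'VP'),
         (2, 4, 'S'), (2, 4, 'VP'), (3, 4, 'NP')]


def phrases_by_label(words, spans, labels=("NP", "VP", "PP")):
    """Group the text of each span by its label, keeping only the wanted labels.

    Assumes each span is (start, end, label) with `end` exclusive.
    """
    wanted = set(labels)
    grouped = defaultdict(list)
    for start, end, label in spans:
        if label in wanted:
            grouped[label].append(" ".join(words[start:end]))
    return dict(grouped)


result = phrases_by_label(words, spans)
print(result)
# {'NP': ['She', 'tennis'], 'VP': ['enjoys playing tennis', 'playing tennis']}
```

No PP key appears because this sentence contains no prepositional phrase.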

github-actions[bot] commented 2 years ago

This issue is stale because it has been open for 30 days with no activity.

github-actions[bot] commented 2 years ago

This issue was closed because it has been inactive for 7 days since being marked as stale.

yzhangcs / parser

How to easily get VP/NP/PP phrase? #97