stephbuon / posextract

Grammatical information extraction methods designed for the analysis of historical and contemporary textual corpora.
MIT License

Extract triples (also an example of extracting obj-verb-obj) #178

Open stephbuon opened 1 year ago

stephbuon commented 1 year ago
sent = 'He just finished his undergraduate degree in genealogy at a university known for its Greek life, Southern Axolotl University (SAU).'

triples = grammatical_triples.extract(sent, TripleExtractorOptions(prep_phrase=True))

for triple in triples:
    print(triple)

Returns:

He finished degree
He finished just
He finished at university

But should return:

He finished degree in genealogy
He finished at university known for greek life
Southern Axolotl University known for greek life # example of obj-verb-obj

stephbuon commented 1 year ago

To extract the object-verb-prep-object "Southern Axolotl University-known-for-greek life"

Identify the verb. If the verb's head is pobj, visit it.

If the verb's child is prep, visit it.

If the prep's child is pobj, visit it.
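
The traversal above can be sketched without a parser, assuming the verb's prepositional child carries the `prep` label as in spaCy's dependency scheme. The `Token` class, the `object_verb_prep_object` helper, and the hand-built parse fragment are all hypothetical stand-ins for illustration; none of them are part of posextract, and the dependency labels are written by hand rather than produced by a real parse.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(eq=False)
class Token:
    """Hypothetical stand-in for a parsed token (text, dependency label, head, children)."""
    text: str
    dep: str
    head: Optional["Token"] = field(default=None, repr=False)
    children: List["Token"] = field(default_factory=list, repr=False)

    def attach(self, child: "Token") -> "Token":
        """Make `child` a dependent of this token and return it."""
        child.head = self
        self.children.append(child)
        return child


def object_verb_prep_object(verb: Token) -> List[Tuple[str, str, str, str]]:
    """Follow the three steps: verb's head is pobj, verb's child is prep, prep's child is pobj."""
    results: List[Tuple[str, str, str, str]] = []
    head = verb.head
    # Step 1: the verb must hang off an object of a preposition.
    if head is None or head.dep != "pobj":
        return results
    # Step 2: visit each prepositional child of the verb.
    for prep in verb.children:
        if prep.dep != "prep":
            continue
        # Step 3: visit each object of that preposition.
        for obj in prep.children:
            if obj.dep == "pobj":
                results.append((head.text, verb.text, prep.text, obj.text))
    return results


# Hand-built fragment for "... at a university known for its Greek life"
# (labels are assumed, not taken from an actual parse).
at = Token("at", "prep")
university = at.attach(Token("university", "pobj"))
known = university.attach(Token("known", "acl"))
for_ = known.attach(Token("for", "prep"))
life = for_.attach(Token("life", "pobj"))

for quad in object_verb_prep_object(known):
    print(" ".join(quad))
```

This sketch only returns head words. Expanding "university" to "Southern Axolotl University" or "life" to "Greek life" would need extra handling of appositives and modifiers, which is not attempted here.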

stephbuon commented 1 year ago

@stephbuon -- any other verb types?

stephbuon commented 1 year ago

@stephbuon

|He||||finished||at|||university||| |0
|He||||finished|||||degree||| |0
|He||||finished|||||just||| |0
|degree||||finish||at|||university||| |0
|degree||||finish|||||degree||| |0
|degree||||finish|||||just||| |0
|just||||finished||at|||university||| |0
|just||||finished|||||degree||| |0
|just||||finished|||||just||| |0
|He||||known||at|||university||| |0
|He||||known||for|||life||| |0
|university||||known||at|||univer