subject_verb_object_triples Enhancement

chartbeat-labs / textacy

NLP, before and after spaCy


Could textacy also handle references (and maybe coreferences), adjectives of subjects and objects, and adverbs of verbs?

from __future__ import absolute_import, unicode_literals
from textacy import cache, extract

spacy_lang = cache.load_spacy('en')

text = "Sam and his fat friend are nice men that didn't hurt my child and sister"

spacy_doc = spacy_lang(text.strip())

[', '.join(item.text for item in triple)
 for triple in extract.subject_verb_object_triples(spacy_doc)]
Out[33]: 
['Sam, are, men',
 'friend, are, men',
 "that, didn't hurt, child",
 "that, didn't hurt, sister"]

EXPECTED OUTPUT: ["Sam, are, men", "Sam's fat friend, are, men", "Sam and his fat friend, didn't hurt, my child", "Sam and his fat friend, didn't hurt, my sister"]

EVEN BETTER, THOUGH I DON'T KNOW IF IT'S EASILY POSSIBLE: ["Sam, is, man", "Sam's fat friend, is, man", "Sam and his fat friend, didn't hurt, my child", "Sam and his fat friend, didn't hurt, my sister"]
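
To make the request more concrete, here is a rough sketch of the kind of expansion I have in mind. It is not textacy or spaCy code: `Tok`, `attach`, `expand_noun` and `resolve_relative` are hypothetical names, the parse is written by hand, and the dependency labels (`amod`, `poss`, `relcl`, ...) are assumed to follow the usual English scheme. It only shows pulling modifiers into a noun and mapping a relative pronoun back to its syntactic antecedent; real coreference ("his" to "Sam") is not attempted.

```python
from typing import List, Optional


class Tok:
    """Hypothetical stand-in for a parsed token (not a spaCy/textacy class)."""

    def __init__(self, i: int, text: str, dep: str) -> None:
        self.i = i                          # position in the sentence
        self.text = text
        self.dep = dep                      # dependency label to the head
        self.head: Optional["Tok"] = None
        self.children: List["Tok"] = []


def attach(child: Tok, head: Tok) -> Tok:
    """Link a child token to its head and return the child."""
    child.head = head
    head.children.append(child)
    return child


# Assumed, minimal set of modifier labels to keep around a noun.
NOUN_MODIFIERS = {"amod", "poss", "det", "compound"}
RELATIVE_PRONOUNS = {"that", "who", "which"}


def expand_noun(tok: Tok) -> str:
    """Return the noun with its direct modifiers, in sentence order."""
    kept = [tok] + [c for c in tok.children if c.dep in NOUN_MODIFIERS]
    return " ".join(t.text for t in sorted(kept, key=lambda t: t.i))


def resolve_relative(tok: Tok) -> Tok:
    """Map a relative-pronoun subject to the noun its clause modifies."""
    verb = tok.head
    if (
        tok.text.lower() in RELATIVE_PRONOUNS
        and verb is not None
        and verb.dep == "relcl"
        and verb.head is not None
    ):
        return verb.head
    return tok


# Hand-built fragments of the example sentence (token indices are positions
# in "Sam and his fat friend are nice men that did n't hurt my child ...").
friend = Tok(4, "friend", "conj")
attach(Tok(2, "his", "poss"), friend)
attach(Tok(3, "fat", "amod"), friend)

men = Tok(7, "men", "attr")
attach(Tok(6, "nice", "amod"), men)
hurt = attach(Tok(11, "hurt", "relcl"), men)
that = attach(Tok(8, "that", "nsubj"), hurt)
child = attach(Tok(13, "child", "dobj"), hurt)
attach(Tok(12, "my", "poss"), child)

subject_text = expand_noun(resolve_relative(that))
object_text = expand_noun(child)
friend_text = expand_noun(friend)
```

With this hand-built parse, "that" resolves to "men" (its syntactic antecedent), not to "Sam and his fat friend"; getting the latter would need an extra step that follows the copula back to the subjects, plus coreference for "his".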

subject_verb_object_triples Enhancement #198