Closed: fakerybakery closed this 5 months ago
I could see breaking ties in favor of proper nouns (when available) being a useful modification.
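To make the suggestion above concrete, here is a minimal sketch of what such a tie-break could look like. It is not Stanza's actual implementation: the `Mention` class and `pick_representative` function are hypothetical names made up for illustration, and the sketch assumes that the representative is otherwise chosen by mention length in words and that a universal POS tag is available for each token.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Mention:
    """Hypothetical stand-in for one coref mention; not a Stanza class."""
    words: List[str]  # tokens of the mention
    upos: List[str]   # one universal POS tag per token (assumed available)

    @property
    def text(self) -> str:
        return " ".join(self.words)

    @property
    def has_proper_noun(self) -> bool:
        return "PROPN" in self.upos


def pick_representative(chain: List[Mention]) -> Mention:
    """Pick the longest mention; on a length tie, prefer one with a proper noun.

    Any remaining tie goes to the earliest mention in the chain.
    """
    best = chain[0]
    for mention in chain[1:]:
        longer = len(mention.words) > len(best.words)
        same_length = len(mention.words) == len(best.words)
        wins_tie = same_length and mention.has_proper_noun and not best.has_proper_noun
        if longer or wins_tie:
            best = mention
    return best


# The three mentions from the example sentence in this issue.
chain = [
    Mention(["I"], ["PRON"]),
    Mention(["John"], ["PROPN"]),
    Mention(["He"], ["PRON"]),
]
print(pick_representative(chain).text)  # John
```

All three mentions are one word long, so without the tie-break the first one ("I") would win. With it, "John" is preferred because it is the only mention tagged `PROPN`.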
On Thu, Jan 4, 2024 at 5:19 PM, mrfakename wrote:
Hi, thanks for this tool. I noticed that coref sometimes doesn't use the proper noun. Is there any way to make it use the proper noun? Here is my code (WIP):
    import stanza

    pipe = stanza.Pipeline("en", processors="tokenize,coref")
    t = pipe('"I am doing this," John said. He did it.')

    final = []
    nouns = []
    for sente in t.to_dict():
        sent = []
        exclude_ids = []
        for word in sente:
            if word['id'] not in exclude_ids:
                # A tuple id marks a multi-word token; skip its component words later.
                if isinstance(word['id'], tuple):
                    exclude_ids += word['id']
                if "coref_chains" in word and isinstance(word['coref_chains'], list):
                    # Replace non-representative mentions with the chain's representative text.
                    if word['coref_chains'] and not word['coref_chains'][0].is_representative:
                        print(word['coref_chains'][0].to_json())
                        sent.append(word['coref_chains'][0].chain.representative_text)
                    else:
                        sent.append(word['text'])
                else:
                    sent.append(word['text'])
        sent = [item.strip() for item in sent if item and item.strip()]
        # Rejoin tokens, attaching punctuation to the preceding word.
        x = ''
        for i in sent:
            if i in ['.', ',', '?', ';', ':']:
                x += i
            else:
                x += ' ' + i
        if sent:
            final.append(x.strip())
    print(' '.join(final))
Output:
" I am doing this, " I said. I did this.
It should be:
" John am doing this, " John said. John did this.
Thank you!
This is now part of the 1.8.2 release.