explosion / spaCy

💫 Industrial-strength Natural Language Processing (NLP) in Python
https://spacy.io
MIT License

Since 3.4.2 doc.to_json(['noun_chunks']) does not work anymore #11688

Closed. acassaignemondeca closed this issue 2 years ago

acassaignemondeca commented 2 years ago

How to reproduce the behaviour

This code works in 3.4.1 but not in 3.4.2:

import spacy
from spacy.tokens import Doc

# Doc-level extension that exposes noun chunks as plain start/end dicts
def np_getter(doc):
    return [{'start': chunk.start, 'end': chunk.end} for chunk in doc.noun_chunks]

Doc.set_extension("noun_chunks", getter=np_getter, force=True)
nlp = spacy.load("en_core_web_trf")
doc = nlp("I like New York")
doc._.noun_chunks  # evaluates the getter
doc.to_json(['noun_chunks'])  # works in 3.4.1, raises E106 in 3.4.2

In 3.4.2, the last line raises this error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "spacy/tokens/doc.pyx", line 1703, in spacy.tokens.doc.Doc.to_json
ValueError: [E106] Can't find `doc._.noun_chunks` attribute specified in the underscore settings: ['noun_chunks']

This appears to be related to the commit "Fix multiple entries per custom extension in doc json" (#11551).

Thank you

polm commented 2 years ago

Thanks for the report, that does look like a bug! We'll get right on fixing it.
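
For anyone stuck on an affected version, one possible workaround is to skip the underscore argument and attach the extension values by hand. This is only a sketch and was not proposed in this thread: `to_json_with_extensions` is a hypothetical helper, not a spaCy function, and writing the values under the "_" key is an assumption based on where doc-level extension values normally appear in `Doc.to_json` output.

```python
import json
from typing import Any, Dict, Iterable


def to_json_with_extensions(doc: Any, attrs: Iterable[str]) -> Dict[str, Any]:
    """Serialize `doc` and attach doc-level extension values by hand.

    Hypothetical helper, not part of spaCy. `doc` only needs to behave like
    a spacy.tokens.Doc: a to_json() method and a `_` extension namespace.
    """
    # Call to_json() without an underscore list, so the extension lookup
    # that raises E106 in the report is never triggered.
    data = doc.to_json()
    extras = {}
    for attr in attrs:
        # Reading from `doc._` runs the getter registered via set_extension.
        value = getattr(doc._, attr)
        # Fail early (TypeError) if the value cannot be written as JSON.
        json.dumps(value)
        extras[attr] = value
    if extras:
        # Assumption: "_" is the key used for doc-level extension values.
        data.setdefault("_", {}).update(extras)
    return data
```

With the reproduction from the report, `to_json_with_extensions(doc, ['noun_chunks'])` would be called in place of `doc.to_json(['noun_chunks'])`. The helper reads through `doc._`, so it should work for getter-based extensions as well as ones with stored values.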
