Open NixBiks opened 1 month ago
My current workaround is to create a custom span attribute and filter on that.
Span.set_extension("is_removed", default=False)
Instead of removing the object with del
I simply change that custom attribute span.doc.spans["my_spans"][index]._.is_removed = True
Thanks for the report. I'll have a look at this.
Is it possible you're removing an element during iteration? From the implementation this wouldn't be expected to work: https://github.com/explosion/spaCy/blob/master/spacy/tokens/span_group.pyx#L113
It's inside some iteration but I'm not iterating over the spans in the SpanGroup no. I've iterated earlier to find the index that needs to be removed though. I hope it makes sense.
I think it makes sense yeah. The pattern you want to avoid is something like:
for span in span_group:
if check_remove(span):
del span_group[span]
Deleting during iteration like that won't work, and of course precalculating an index and not adjusting it after you delete will cause problems too. But the latter one shouldn't cause an out-of-bounds access. Unless there's a bug in how we check the indexing, it should only be possible to get the out-of-bounds access from within the class's C methods.
These are the tests we have for the __del__
method, which look okay to me: https://github.com/explosion/spaCy/blob/master/spacy/tests/doc/test_span_group.py#L191
Usually if we have a memory access problem in the tests we do get a crash, because we've got a lot of tests that run in various orders. So it must be some sort of access pattern that's not working. Are you able to share any code?
How to reproduce the behaviour
Unfortunately I haven't been able to make a minimal reproducible example. The error happens sometimes when I run my tests - the more tests I run the higher probability of running into the following critical error
I've found out that I don't encounter the error if I remove the following line.
The documentation states this is the way to do it though. Note there is a typo in the code in the docs though.
I realise you most likely can't help with this little info but maybe you can give me some ways to investigate further.
Info about spaCy