dwadden / dygiepp

Span-based system for named entity, relation, and event extraction.
MIT License

parse_ace_event.py cannot be executed correctly #74

Closed: ws-researcher closed this issue 3 years ago

ws-researcher commented 3 years ago

parse_ace_event.py cannot be executed correctly

dwadden commented 3 years ago

Can you please provide the command that you're running to cause the error, as well as a full stack trace showing the error message?

ws-researcher commented 3 years ago

When I execute the command 'python ./scripts/data/ace-event/parse_ace_event.py default-settings', I get an error (screenshot). It looks like after sent.as_doc(), the character offset of each token (tok.idx) has changed, but the character offset of the entity (entity.start_char) has not, so 'start_token = [tok for tok in sent if tok.idx == entity.start_char]' comes back empty.
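The mismatch described above can be sketched in plain Python, without spaCy or the ACE data. This is only an illustration of the reported hypothesis, assuming that copying a sentence into a standalone document restarts token offsets at 0 while the entity keeps its document-level offset. The Token class, the tokenize helper, and the example sentence are all hypothetical and are not part of dygiepp or spaCy.

```python
from dataclasses import dataclass


@dataclass
class Token:
    text: str
    idx: int  # character offset of the token's first character


def tokenize(text, base=0):
    """Hypothetical whitespace tokenizer that records character offsets.

    `base` is added to every offset, so base=0 mimics a standalone copy
    of the sentence and base=sent_start mimics the sentence in its document.
    """
    tokens, pos = [], 0
    for word in text.split(" "):
        tokens.append(Token(word, base + pos))
        pos += len(word) + 1  # +1 for the single separating space
    return tokens


doc_text = "He left. Alice joined IBM yesterday."
sent_start = doc_text.index("Alice")       # where the second sentence begins
sent_text = doc_text[sent_start:]
entity_start_char = doc_text.index("IBM")  # document-level entity offset

# Offsets as seen while the sentence is still part of the full document.
sent_in_doc = tokenize(sent_text, base=sent_start)
# Offsets after the sentence is copied out on its own (assumed to restart at 0).
sent_as_doc = tokenize(sent_text, base=0)

found_in_doc = [t for t in sent_in_doc if t.idx == entity_start_char]
found_as_doc = [t for t in sent_as_doc if t.idx == entity_start_char]

# One possible workaround: shift the entity offset into sentence coordinates.
found_rebased = [
    t for t in sent_as_doc if t.idx == entity_start_char - sent_start
]
```

Under these assumptions, the lookup succeeds against the in-document tokens, fails against the standalone copy, and succeeds again once the entity offset is shifted by the sentence start.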

dwadden commented 3 years ago

I was able to run the script without error. This is a bit tricky to debug, since the ACE distribution isn't public. Let's try this:

Sorry I can't help more.

scanf3 commented 3 years ago

I also encountered this problem when I used spacy's en_core_web_md model. And the problem seemed to disappear when I used en_core_web_sm instead.

dwadden commented 3 years ago

Interesting, thanks @scanf3 for pointing this out! I just updated the README and the ACE preprocessing code to explicitly use en_core_web_sm; see this pull request.

@ws-researcher, if this resolves your problems, feel free to close this issue.
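For anyone who wants to guard against loading a different model by accident, here is a minimal sketch of a check. It assumes the metadata dictionary has the "lang" and "name" keys that spaCy pipelines expose through nlp.meta. The helper names full_model_name and require_model are hypothetical and are not part of dygiepp or spaCy.

```python
def full_model_name(meta):
    """Join the language code and model name, e.g. 'en' + 'core_web_sm'."""
    return f"{meta['lang']}_{meta['name']}"


def require_model(meta, expected="en_core_web_sm"):
    """Raise ValueError if the loaded pipeline is not the expected model."""
    actual = full_model_name(meta)
    if actual != expected:
        raise ValueError(f"expected spaCy model {expected!r}, got {actual!r}")
    return actual
```

In a preprocessing script this could be called as require_model(nlp.meta) right after nlp = spacy.load("en_core_web_sm").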

ws-researcher commented 3 years ago

Thanks for pointing this out, @scanf3 and @dwadden!