jerbarnes / semeval22_structured_sentiment

SemEval-2022 Shared Task 10: Structured Sentiment Analysis

Error processing mpqa #1

Closed by MinionAttack 3 years ago

MinionAttack commented 3 years ago

Hi,

I'm trying to run Step 1, but I get this error when processing the MPQA 2.0 corpus:

2021-09-07 14:50:30 INFO: Loading these models for language: en (English):
========================
| Processor | Package  |
------------------------
| tokenize  | combined |
========================

2021-09-07 14:50:30 INFO: Use device: gpu
2021-09-07 14:50:30 INFO: Loading: tokenize
2021-09-07 14:50:38 INFO: Done loading processors!
  0%|                                                                                                                                                                           | 0/287 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "process_mpqa.py", line 355, in <module>
    main()
  File "process_mpqa.py", line 343, in main
    new = process_file(fname, nlp)
  File "process_mpqa.py", line 315, in process_file
    sents = get_sents(text, fname, nlp)
  File "process_mpqa.py", line 189, in get_sents
    sent_bidx = int(sentence.tokens[0].misc.split("|")[0].split("=")[1])
AttributeError: 'NoneType' object has no attribute 'split'

For the Darmstadt Service Review Corpus I have no problems.

jerbarnes commented 3 years ago

Looks like this was caused by an old version of stanza. I've updated the code to work with stanza >= 1.2.3. Let me know if it works for you, and if so, I'll close the issue.
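
For anyone who hits the same traceback: the failing line in `get_sents` calls `.split` on `misc`, which is `None` here. Below is a minimal sketch of a defensive parse. It is not the change that was committed to the repository. The helper name `parse_start_char`, its `fallback` parameter, and the `start_char=...|end_char=...` layout of the `misc` string are assumptions made for illustration.

```python
from typing import Optional


def parse_start_char(misc: Optional[str], fallback: Optional[int] = None) -> Optional[int]:
    """Extract the start offset from a misc string such as 'start_char=0|end_char=5'.

    Hypothetical helper: returns `fallback` when misc is None or has no
    start_char entry, instead of raising AttributeError on None.
    """
    if not misc:
        return fallback
    for field in misc.split("|"):
        # partition never raises, even when the field has no '='
        key, sep, value = field.partition("=")
        if sep and key == "start_char":
            return int(value)
    return fallback
```

A caller could then log the file name and skip the sentence when the helper returns `None`, rather than crashing on the first file.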

MinionAttack commented 3 years ago

Hi, I'm still having the same problem even with stanza==1.2.3.

jerbarnes commented 3 years ago

Have you pulled the changes? If so, what error do you get now? It shouldn't be possible to get the same error as before.
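
One way to rule out the library side is to check the stanza version installed in the active environment against the 1.2.3 minimum mentioned above. This is a rough sketch: the helper names `version_tuple` and `meets_minimum` are hypothetical, and the check says nothing about whether the local checkout of the repository is up to date.

```python
from importlib import metadata


def version_tuple(version: str) -> tuple:
    """Turn '1.2.3' into (1, 2, 3), ignoring suffixes such as 'rc1'."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = "1.2.3") -> bool:
    """Return True when the installed version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


if __name__ == "__main__":
    try:
        installed = metadata.version("stanza")
    except metadata.PackageNotFoundError:
        print("stanza is not installed in this environment")
    else:
        print(f"stanza {installed}: new enough = {meets_minimum(installed)}")
```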

MinionAttack commented 3 years ago

Oops, sorry, I forgot to pull the changes! What a rookie mistake.