Open btyu opened 2 years ago
Looking forward to your reply if you get time! Thank you!
Hi, thanks for your interest and so sorry for the long wait! I was occupied with something else in the past few weeks. I looked at the error, and I think both errors are because of a bug in CogComp/SRL-English.
Specifically, the first error is actually at this line (it is about the iteration, not the overlap() function):
text_piece = ' '.join([verb_srl_tokens[i] for i, tag in enumerate(res['tags']) if overlap(tag, srl_consts_for_trg)])
Here it's basically concatenating the tokens of the input sentence whose SRL tag is in a pre-specified set (the srl_consts field in config.json). So it expects the number of tokens (verb_srl_tokens) and the number of SRL tags (res['tags']) to be the same. But they aren't the same for this example, as printing them shows:
print(len(verb_srl_tokens))
print(len(res["tags"]))
Output:
19
21
Both variables come from the SRL output. verb_srl_tokens is the "words" field:
"words": ["Jalal", "Jamil", ",", "a", "45-year", "-", "old", "jewellery", "store", "owner", ",", "said", "the", "situation", "just", "keeps", "getting", "worse", "."]
and res["tags"] is the "tags" field in the first element of the "verbs" list ("verb": "situation"):
"tags": ["B-ARG0", "I-ARG0", "I-ARG0", "I-ARG0", "I-ARG0", "I-ARG0", "I-ARG0", "I-ARG0", "I-ARG0", "I-ARG0", "I-ARG0", "I-ARG0", "I-ARG0", "B-V", "B-ARG1", "I-ARG1", "I-ARG1", "I-ARG1", "I-ARG1", "I-ARG1", "O"]
Our code expects that these two lists should be of the same length, but here it's not the case in the SRL output. That's what caused the error.
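To make this failure easier to spot, the concatenation could be guarded with an explicit length check. The following is only a minimal sketch, not the code in source/utils/srl.py: the overlap() here is a simplified stand-in for the repo's function, and build_text_piece and the {"ARG1"} set are hypothetical names and values introduced for illustration.

```python
def overlap(tag, srl_consts):
    """Simplified stand-in (assumption): True if the tag's role,
    e.g. 'ARG1' in 'B-ARG1', is one of the wanted constituents."""
    role = tag.split("-", 1)[-1]
    return role in srl_consts


def build_text_piece(words, tags, srl_consts):
    """Join the tokens whose SRL tag is in srl_consts.

    Hypothetical helper: checks that words and tags line up first, so a
    malformed SRL output fails with a clear message instead of an IndexError.
    """
    if len(words) != len(tags):
        raise ValueError(
            f"SRL output mismatch: {len(words)} words vs {len(tags)} tags"
        )
    return " ".join(w for w, t in zip(words, tags) if overlap(t, srl_consts))


# The "words" and "tags" fields quoted in this thread.
words = ["Jalal", "Jamil", ",", "a", "45-year", "-", "old", "jewellery",
         "store", "owner", ",", "said", "the", "situation", "just", "keeps",
         "getting", "worse", "."]
tags = ["B-ARG0"] + ["I-ARG0"] * 12 + ["B-V", "B-ARG1"] + ["I-ARG1"] * 5 + ["O"]

try:
    build_text_piece(words, tags, {"ARG1"})
except ValueError as err:
    print(err)  # SRL output mismatch: 19 words vs 21 tags
```

A check like this does not repair the SRL output; it only reports the mismatch where it originates, so affected sentences can be skipped or logged.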
We apologize for the inconvenience, but it seems that CogComp/SRL-English has undergone some changes since we published our paper. Since it's not maintained by us, could you please raise this issue (basically, different lengths of "tags" and "words" in the output) in that repo directly?
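When reporting this, it may help to list exactly which predicates are affected. Below is a rough sketch, assuming each SRL output is a dict with a "words" list and a "verbs" list whose entries carry "verb" and "tags", as in the excerpt above; find_length_mismatches is a hypothetical helper name, not part of either repo.

```python
def find_length_mismatches(srl_output):
    """Return (verb_position, verb, n_words, n_tags) for every predicate
    whose tag sequence does not line up with the sentence tokens."""
    words = srl_output["words"]
    mismatches = []
    for i, verb_entry in enumerate(srl_output.get("verbs", [])):
        tags = verb_entry["tags"]
        if len(tags) != len(words):
            mismatches.append((i, verb_entry.get("verb"), len(words), len(tags)))
    return mismatches


# Tiny made-up example: the second predicate has one tag too many.
example = {
    "words": ["Prices", "keep", "rising"],
    "verbs": [
        {"verb": "keep", "tags": ["B-ARG1", "B-V", "B-ARG2"]},
        {"verb": "rising", "tags": ["B-ARG1", "O", "B-V", "O"]},
    ],
}
print(find_length_mismatches(example))  # [(1, 'rising', 3, 4)]
```

Running this over the SRL outputs for the test set would give a concrete list of sentences to attach to the upstream report.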
Thanks for your understanding and please let me know if you have other questions.
I changed the SRL code. However, I am not sure if the change is correct, so I reported the issue and the modification I made here.
Hi @evelinamorim, thanks for your interest! While waiting for the response from the SRL authors, you are welcome to look at the updated notes in our README on how to resolve a known inconsistency from the SRL system (please refer to "UPDATE (10/28/2022)" under "Getting the SRL output"), and see if this helps with your issue.
Hi! Thank you for your excellent work and the well-organized codebase.
I am trying to run the inference pipeline, but it raises an IndexError from source/utils/srl.py. I also hit another IndexError in the SRL process and skipped it by changing the SRL code. I am not sure whether the first one has something to do with the latter one, or whether I obtained the SRL result the right way. The details follow.
IndexError in Event Extraction
I run source/predict_evaluate.py on the test set and get the following IndexError:
And this is the error line: https://github.com/veronica320/Zeroshot-Event-Extraction/blob/24bb003a31827f41367daf5cffe0b4521d741da3/source/utils/srl.py#L153
I guess it is a bug? For your convenience, here are the corresponding files, which contain only the problematic sample: sample.zip. I am not sure whether the problem is related to another IndexError I met in SRL, which I describe in the next section.
IndexError in SRL
I use the SRL code you referred to, and these are the commands to process the samples with nominal_sense_srl and verb_sense_srl respectively.
I am not quite sure whether the above commands are right, so please let me know if they are not. nominal_sense_srl works well, and the other fails with the following IndexError:
And this is the error line: https://github.com/CogComp/SRL-English/blob/33278a6590b9dd6652a7ed55cc313b42c3fb3f2a/verb_sense_srl/reader.py#L342
I noticed that the sample index is 378, which is not the same sample that causes the IndexError in EE, so the first IndexError is unlikely to be related to this one. Altogether, three samples in the test set cause the error. I skipped this error by setting verb_index and verb to None if an IndexError is raised, but I am not sure what side effects this will bring.
Could you please check the possible bugs above? Thank you!