VT-NLP / Event_Query_Extract

MIT License

Questions about arg_detection and zero-shot task #5

Closed xiezhiyu01 closed 1 year ago

xiezhiyu01 commented 1 year ago

Dear Author:

I ran into a few problems when trying to run this codebase. Could you help with the following questions? Thanks!

  1. In run_arg_detection.py, lines 8 and 14 import Write2HtmlArg and FactContainer, neither of which is included in the codebase.
  2. The preprocessing code in save_dataset.py only produces the dataset for run_trigger_detection.py, not for run_arg_detection.py. run_arg_detection.py unpacks 16 values from each batch, but only 7 are preprocessed, so I don't know how to generate the dataset for the EAE task.
  3. I think there is a bug in how the arguments for each event are preprocessed. At https://github.com/VT-NLP/Event_Query_Extract/blob/main/preprocess/save_dataset.py#L256, if there are multiple events in a sentence, arg_list only contains the arguments of the last one. Since the EAE evaluation code isn't included in this codebase (as mentioned in 1.), I don't know how you align multiple event mentions with a single arg_list.
  4. The paper mentions zero-shot EE, but the zero-shot code isn't included in the repository. Since zero-shot requires a different data format (the paper says the prototype triggers aren't included), I think adding the code for this part would help us reproduce the results.
  5. We need to add train.doc.txt, dev.doc.txt, and test.doc.txt to data/splits/ACE05-E/ before running ./setup.sh. Each file lists the documents in the corresponding split, one document name per line. It would help to state this in README.md.
  6. The pos_tag, time, and value fields don't appear if we follow the preprocessing procedure in your ACE_ERE_scripts repository. So to run save_dataset.py successfully, we need to change _unpack_ace_with_vt as I mentioned in https://github.com/VT-NLP/Event_Query_Extract/issues/4.
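
To make point 3 concrete, here is a minimal sketch of the symptom described there. It does not reproduce the actual code in save_dataset.py. The event dictionaries, their keys, and both function names are hypothetical and exist only for illustration.

```python
# Hypothetical event structure: each event has a trigger and a list of arguments.
def collect_args_overwriting(events):
    """Mimics the reported symptom: arg_list is reassigned per event."""
    arg_list = []
    for event in events:
        # Reassignment discards the arguments of every earlier event.
        arg_list = [(a["role"], a["text"]) for a in event["arguments"]]
    return arg_list


def collect_args_per_event(events):
    """Keeps one argument list per event, in the same order as the events."""
    return [
        [(a["role"], a["text"]) for a in event["arguments"]]
        for event in events
    ]


sentence_events = [
    {"trigger": "attacked", "arguments": [{"role": "Attacker", "text": "rebels"}]},
    {"trigger": "died", "arguments": [{"role": "Victim", "text": "soldiers"}]},
]

last_only = collect_args_overwriting(sentence_events)
aligned = collect_args_per_event(sentence_events)
```

With two events in the sentence, `last_only` holds only the arguments of the second event, while `aligned[i]` holds the arguments of `sentence_events[i]`, which is the alignment an EAE evaluation would need.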

sijiawang0221 commented 1 year ago

Thank you for your interest in our work!

  1. We updated the argument evaluation in scripts/eval.py, as the previous Write2HtmlArg is deprecated.
  2. The argument detection data can be created with python scripts/eval.py --save_arg_pt.
  3. Thank you for finding the bug in save_dataset.py. It has been fixed.
  4. For zero-shot EE, as we mentioned in the paper, we use the event type name as the query, so we use the names in trigger_representation.json when we construct the input sentences.
  5. Thank you for the suggestion.
  6. Replied in the related issue.
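
The zero-shot setup in point 4 could look roughly like the sketch below. It is an illustration under assumptions, not the repository's implementation. The JSON layout (event type name mapped to a list of prototype triggers), the name normalisation, the `[CLS]`/`[SEP]` input layout, and the function names `build_query` and `build_input` are all hypothetical.

```python
import json

# Assumed layout for illustration only; the real trigger_representation.json
# may be structured differently.
EXAMPLE_JSON = '{"Conflict:Attack": ["war", "attack"], "Life:Die": ["die", "killed"]}'


def build_query(event_type, prototypes=None, zero_shot=False):
    """Return query tokens for one event type.

    In the zero-shot case only the event type name is used; otherwise the
    prototype triggers are appended after the name.
    """
    # Hypothetical normalisation: "Conflict:Attack" -> ["conflict", "attack"]
    name_tokens = event_type.replace(":", " ").replace("-", " ").lower().split()
    if zero_shot or not prototypes:
        return name_tokens
    return name_tokens + list(prototypes)


def build_input(sentence_tokens, query_tokens):
    """Pair a sentence with a query in a BERT-style layout (an assumption)."""
    return ["[CLS]"] + list(sentence_tokens) + ["[SEP]"] + list(query_tokens) + ["[SEP]"]


representation = json.loads(EXAMPLE_JSON)
event_type = "Conflict:Attack"
supervised_query = build_query(event_type, representation[event_type])
zero_shot_query = build_query(event_type, representation[event_type], zero_shot=True)
model_input = build_input(["Rebels", "attacked", "the", "town"], zero_shot_query)
```

Here `zero_shot_query` contains only the tokens of the type name, so no prototype triggers from annotated data are needed for unseen event types.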