salokr closed this issue 5 months ago
You can find how we preprocessed ACE here: bash_scripts/preprocess_ace.sh
In principle, that error arises because there is an instance of a Movement:Transport event that has an argument of type Person, which is not defined in the guidelines. We did not face such an error when preprocessing ACE.
We did, in fact, make some small adaptations to remove inconsistencies from the annotations:
# Inconsistency between data and annotation guideline argument names
arg_name_mapping = {
    "ATTACK": {"Victim": "Target", "Agent": "Attacker"},
    "APPEAL": {"Plaintiff": "Prosecutor"},
    "PHONE-WRITE": {"Place": None},
}
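As a rough illustration of how a mapping like this could be applied, here is a minimal sketch. The `normalize_arguments` helper and the event layout (an `event_type` string such as `Conflict:Attack`, plus argument dictionaries with a `role` key) are assumptions made for the example, not the actual code in `preprocess_ace.py`. A value of `None` is taken to mean that the argument is dropped.

```python
from typing import Dict, List, Optional

# Mapping copied from the comment above:
# event subtype -> {role name in the data: role name in the guidelines}.
arg_name_mapping: Dict[str, Dict[str, Optional[str]]] = {
    "ATTACK": {"Victim": "Target", "Agent": "Attacker"},
    "APPEAL": {"Plaintiff": "Prosecutor"},
    "PHONE-WRITE": {"Place": None},
}


def normalize_arguments(event_type: str, arguments: List[dict]) -> List[dict]:
    """Rename or drop argument roles (hypothetical helper, for illustration only)."""
    # Assumption: event_type looks like "Conflict:Attack" and the subtype
    # is matched against the mapping keys in upper case.
    subtype = event_type.split(":")[-1].upper()
    mapping = arg_name_mapping.get(subtype, {})
    normalized = []
    for arg in arguments:
        role = arg["role"]
        if role in mapping:
            new_role = mapping[role]
            if new_role is None:
                # Assumption: None means the role is removed entirely.
                continue
            # Copy the argument so the input list is left untouched.
            arg = {**arg, "role": new_role}
        normalized.append(arg)
    return normalized
```

Under these assumptions, a `Victim` argument of an Attack event comes back as `Target`, and a `Place` argument of a Phone-Write event is removed.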
Can you provide the id of the event that has that event-argument combination?
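One way to locate such an event is to scan the preprocessed output for the event-argument pair in question. This is only a minimal sketch: it assumes a OneIE-style JSON-lines layout (one JSON object per line, with an `event_mentions` list whose items carry `id`, `event_type`, and `arguments` entries with `role` and `text` fields), and `find_events_with_role` is a hypothetical helper, not part of the repository.

```python
import json
from typing import Iterable, Iterator, Tuple


def find_events_with_role(
    lines: Iterable[str], event_type: str, role: str
) -> Iterator[Tuple[str, str]]:
    """Yield (event id, argument text) for every matching event-argument pair."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Assumption: events live under "event_mentions", as in OneIE output.
        for event in record.get("event_mentions", []):
            if event.get("event_type") != event_type:
                continue
            for argument in event.get("arguments", []):
                if argument.get("role") == role:
                    yield event.get("id", "<no id>"), argument.get("text", "")
```

Passing an open file handle for one of the generated JSON files together with `"Movement:Transport"` and `"Person"` would then list the ids of the offending events, if the files follow the assumed layout.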
Hi,
Thank you for the swift response. I tried running the [bash_scripts/preprocess_ace.sh](https://github.com/hitz-zentroa/GoLLIE/issues/bash_scripts/preprocess_ace.sh)
file but I get the following error:
src/dataset/ace_2005/data/**/timex2norm/*.sgm
src/dataset/ace_2005/data
Converting the dataset to JSON format
#SGM files: 0
0it [00:00, ?it/s]
Converting the dataset to OneIE format
Splitting the dataset into train/dev/test sets
Traceback (most recent call last):
  File "src/tasks/ace/preprocess_ace.py", line 1183, in <module>
    split_data(sentence_path, args.output, args.split)
  File "src/tasks/ace/preprocess_ace.py", line 1136, in split_data
    with open(os.path.join(split_path, "train.doc.txt")) as r:
FileNotFoundError: [Errno 2] No such file or directory: 'data/ace05/splits/train.doc.txt'
Where can I find the split files?
You can download the splits from the repo for the OneIE paper, here. Our preprocessing script is the same as theirs with minor tweaks.
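The log above reports `#SGM files: 0`, which suggests the glob pattern matched nothing, and the traceback shows that `train.doc.txt` was not found in the split directory. Both conditions can be checked before running the script. The sketch below is an assumption-laden example: `check_ace_inputs` is a hypothetical helper, and the `dev.doc.txt` and `test.doc.txt` file names are assumed by analogy with `train.doc.txt`, which is the only name confirmed by the traceback.

```python
import glob
import os
from typing import List

# Assumption: the split directory holds one file per split;
# only train.doc.txt is confirmed by the traceback above.
SPLIT_FILES = ("train.doc.txt", "dev.doc.txt", "test.doc.txt")


def check_ace_inputs(data_dir: str, split_dir: str) -> List[str]:
    """Return a list of human-readable problems; an empty list means the inputs look fine."""
    problems = []
    # Same shape as the pattern printed in the log: <data_dir>/**/timex2norm/*.sgm
    pattern = os.path.join(data_dir, "**", "timex2norm", "*.sgm")
    sgm_files = glob.glob(pattern, recursive=True)
    if not sgm_files:
        problems.append(f"no .sgm files match {pattern}")
    for name in SPLIT_FILES:
        path = os.path.join(split_dir, name)
        if not os.path.isfile(path):
            problems.append(f"missing split file {path}")
    return problems
```

With the paths from the log, `check_ace_inputs("src/dataset/ace_2005/data", "data/ace05/splits")` would report each missing piece instead of failing partway through preprocessing.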
nvm, I solved the issue. Thanks for the help and quick suggestions :) I will close the issue now.
Hi, thank you for uploading your code and awesome work to GitHub.
I have downloaded the ACE'05 dataset and would like to generate the code representation for it. Following your suggestions, I ran the following:
python preprocess_ace.py -i <path_to_raw_ace_files> -o <output_dir> -s <path_to_ACE05-E>
However, there were some issues with this, so I made the following changes (line 915) (line 1171)
after which I was able to run your code and get the following three files in the output directory:
My first question: are the steps followed above correct?
If they are, I next run the following code (because I only want to run it on the ACE dataset)
but I get the following errors:
How can I resolve this error?