VT-NLP / Event_Query_Extract

MIT License

About Time and Value arguments #4

Open zyf-zone opened 2 years ago

zyf-zone commented 2 years ago

Dear Author:

When I use the file you provided to process the data, I get this error:

Traceback (most recent call last):
  File "./preprocess/save_dataset.py", line 305, in <module>
    read_from_source(args)
  File "./preprocess/save_dataset.py", line 289, in read_from_source
    raw_data = read_data_from(path, config.tokenizer, ace=args.ace)
  File "F:\Event_Query_Extract-main.\utils\data_to_dataloader.py", line 120, in read_data_from
    output = list(map(partial(_unpack_ace_with_vt, tokenizer=tokenizer), output))
  File "F:\Event_Query_Extract-main.\utils\data_to_dataloader.py", line 60, in _unpack_ace_with_vt
    entities = data[-5].split(' ')
IndexError: list index out of range

I checked the code and have a guess at the cause of the problem. After running python preprocess/ace/read_args_with_entity_ace.py, I found that the content of the output file is as follows:

'Text' 'B-I-O TAG' 'xxxxxxxxxxx.csv'

This seems to contradict the function _unpack_ace_with_vt. Where are the Time and Value arguments, and where is the POS-TAG? What should I do?
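
The mismatch described above can be reproduced in isolation. This is a minimal sketch, assuming each record is a list of three fields as in the output shown; the field values and the helper name `read_entities_with_vt` are made up for illustration and are not taken from the repository.

```python
# Hypothetical record with the three fields seen in the output file:
# text, BIO tags, source file name. The values are illustrative only.
record = ["He was arrested", "O O O", "example.csv"]


def read_entities_with_vt(data):
    """Mimic the failing access `data[-5]` from the traceback."""
    return data[-5].split(' ')


try:
    read_entities_with_vt(record)
    error = None
except IndexError as exc:
    # A three-element list has no fifth-from-last element.
    error = str(exc)

print(error)
```

With only three fields per record, any index below -3 is out of range, which matches the IndexError in the traceback.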

xiezhiyu01 commented 1 year ago

Hi, I also encountered this error. It seems that the Time, Value and POS-TAG fields are all ignored afterwards in save_data.py, so I changed _unpack_ace_with_vt to the following, and I think it works:

def _unpack_ace_with_vt(data, tokenizer):
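    """
    Read one record without Time/Value arguments or POS tags.
    :param data: list of fields for one sentence: text first, entity tags and document id last
    :param tokenizer: tokenizer providing a tokenize(text) method
    :return: sentence, bert_words, triggers, arguments, entities, three None placeholders, doc_id
    """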
    n = len(data)
    sentence = data[0].split()
    bert_words = tokenizer.tokenize(data[0])
    triggers, arguments = [],  []
    if n > 3:
        mid = (n-3) // 4 + 1
        triggers = [x.split(' ') for x in data[1:mid]]
        arguments = [x.split(' ') for x in data[mid:-2]]
        trigger_count[0] += len(triggers)

    entities = data[-2].split(' ')
    doc_id = data[-1].split(' ')

    return sentence, bert_words, triggers, arguments, entities, None, None, None, doc_id

sijiawang0221 commented 1 year ago

Hi @zyf-zone, thank you for your interest in our work, and thank you for bringing this issue up! In our work, we ignore the value and time argument roles. This can be set when calling the read data function at https://github.com/VT-NLP/Event_Query_Extract/blob/6a1ed5a682912a2882e4bf92d71e9b696d71d207/utils/data_to_dataloader.py#L105. The with_vt argument controls whether value and time arguments are included. We'll fix the issue shortly. Thanks also to @xiezhiyu01 for your help!
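
For readers following along, the idea of choosing an unpacking function from a with_vt flag might look roughly like the sketch below. It is not the repository's code: `WhitespaceTokenizer`, `unpack_without_vt`, `unpack_with_vt` and `read_records` are hypothetical names, and the record layout is assumed from the output file and traceback quoted in this thread.

```python
from functools import partial


class WhitespaceTokenizer:
    """Stand-in for the real tokenizer; only `tokenize` is needed here."""

    def tokenize(self, text):
        return text.split()


def unpack_without_vt(data, tokenizer):
    # Assumed layout: text first, entity tags second to last, doc id last.
    return {
        "sentence": data[0].split(),
        "bert_words": tokenizer.tokenize(data[0]),
        "entities": data[-2].split(' '),
        "doc_id": data[-1],
    }


def unpack_with_vt(data, tokenizer):
    # The traceback reads entities from data[-5], so at least five fields
    # are needed; the rest of that layout is not shown in this thread.
    if len(data) < 5:
        raise ValueError(
            f"expected at least 5 fields when with_vt=True, got {len(data)}"
        )
    return {
        "sentence": data[0].split(),
        "bert_words": tokenizer.tokenize(data[0]),
        "entities": data[-5].split(' '),
        "doc_id": data[-1],
    }


def read_records(records, tokenizer, with_vt=False):
    # Pick the unpacking function that matches the preprocessing output.
    unpack = unpack_with_vt if with_vt else unpack_without_vt
    return list(map(partial(unpack, tokenizer=tokenizer), records))


# Illustrative three-field record (values are made up).
records = [["He was arrested", "O O O", "example.csv"]]
parsed = read_records(records, WhitespaceTokenizer(), with_vt=False)
print(parsed[0]["entities"])
```

Checking the field count up front turns the bare IndexError into a message that names the flag, which may make a mismatch between the preprocessing output and the reader easier to spot.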