universal-ie / UIE

Unified Structure Generation for Universal Information Extraction

What do 'task' and 'source prefix' mean? #52

ZhaoyueSun closed 2 years ago

ZhaoyueSun commented 2 years ago

What's the difference between setting the task as "meta" or "event"? Are there any other options? What are the options for 'source prefix'? What should the 'task' and 'source prefix' be when fine-tuning on a new dataset?

ZhaoyueSun commented 2 years ago

I'm currently setting the task to "meta" and the source prefix to "meta: ", as in the README. But when DataCollatorForMetaSeq2Seq is called, the features passed to the data collator include only the following keys: ['input_ids', 'attention_mask', 'labels'], and the program raises KeyError: 'sample_prompt'.

I checked that the features in train_dataset after preprocessing are complete: they include 'sample_prompt' and the other feature keys. Why are these features missing during training?

ZhaoyueSun commented 2 years ago

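Got it. We need to explicitly set "remove_unused_columns=False". By default the Hugging Face Trainer drops dataset columns that the model's forward() does not accept, which is why 'sample_prompt' never reached the data collator.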
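
For anyone hitting the same KeyError, here is a minimal sketch of why the key goes missing. It is a simplified simulation, not the actual Trainer code: ToyModel and drop_unused_columns are hypothetical names made up for illustration, and the real Trainer also keeps label columns by name. The idea is only that keys not present in the model's forward() signature are filtered out before the collator runs, unless the flag is turned off.

```python
import inspect


class ToyModel:
    """Hypothetical stand-in for a seq2seq model; only forward()'s signature matters."""

    def forward(self, input_ids=None, attention_mask=None, labels=None):
        return None


def drop_unused_columns(features, model, remove_unused_columns=True):
    """Simplified imitation of the default filtering (an assumption, not library code):
    keep only the keys that model.forward accepts as parameters."""
    if not remove_unused_columns:
        # Flag disabled: every preprocessed key survives.
        return dict(features)
    accepted = set(inspect.signature(model.forward).parameters)
    return {key: value for key, value in features.items() if key in accepted}


# One preprocessed example with an extra key that the meta collator reads.
example = {
    "input_ids": [1, 2, 3],
    "attention_mask": [1, 1, 1],
    "labels": [4, 5],
    "sample_prompt": True,
}

filtered = drop_unused_columns(example, ToyModel())
kept = drop_unused_columns(example, ToyModel(), remove_unused_columns=False)

print(sorted(filtered))  # 'sample_prompt' is gone, so the collator would raise KeyError
print(sorted(kept))      # 'sample_prompt' is still present
```

In a real run the equivalent change is passing remove_unused_columns=False to the training arguments (the exact place to set it depends on how the run script builds them).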