ZhaoyueSun closed 2 years ago
I'm setting the task to "meta" and the source prefix to "meta: ", as described in the README. But when DataCollatorForMetaSeq2Seq is called, the features passed to it contain only the keys ['input_ids', 'attention_mask', 'labels'], and the program fails with KeyError: 'sample_prompt'.
I checked that the train_dataset is complete after preprocessing: it includes 'sample_prompt' and the other feature keys. Why are these features dropped during training?
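Got it: we need to explicitly set "remove_unused_columns=False". By default the Trainer drops dataset columns that the model's forward method does not accept, so 'sample_prompt' is removed before the batch reaches the data collator.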
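For anyone hitting the same KeyError, here is a minimal sketch of why the key disappears. It is a simplified stand-in, not the Trainer's actual code: `forward`, `drop_unused_columns`, and `meta_collator` are hypothetical toy functions, and the real column filter also keeps a few label-related names. The idea is only that columns are matched against the model's forward signature before the collator sees them.

```python
import inspect


def forward(input_ids=None, attention_mask=None, labels=None):
    """Toy stand-in for a seq2seq model's forward(); only its signature matters."""
    return None


def drop_unused_columns(features, forward_fn):
    """Simplified filter: keep only keys that forward_fn accepts as arguments."""
    accepted = set(inspect.signature(forward_fn).parameters)
    return [{k: v for k, v in f.items() if k in accepted} for f in features]


def meta_collator(features):
    """Toy collator that reads 'sample_prompt', as the collator in this issue does."""
    return [f["sample_prompt"] for f in features]


# One preprocessed example, with the extra key still present.
features = [
    {"input_ids": [1, 2], "attention_mask": [1, 1], "labels": [3], "sample_prompt": True}
]

# Default behaviour (column removal on): the extra key is filtered out.
filtered = drop_unused_columns(features, forward)
print(sorted(filtered[0]))  # ['attention_mask', 'input_ids', 'labels']

try:
    meta_collator(filtered)
except KeyError as err:
    print("KeyError:", err)  # KeyError: 'sample_prompt'

# With column removal disabled, the collator receives the full features.
print(meta_collator(features))  # [True]
```

In the real training script, the equivalent of skipping `drop_unused_columns` is passing `remove_unused_columns=False` in the training arguments. How that is wired up (Python argument or command-line flag) depends on the script, so check how it builds its arguments.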
What is the difference between setting the task to "meta" and setting it to "event"? Are there any other options? What are the options for 'source prefix'? And what should 'task' and 'source prefix' be when fine-tuning on a new dataset?