Open djstrong opened 4 years ago
It could theoretically be possible by adding an entry to jiant.tasks.retrieval.TASK_DICT dynamically, but it is not currently a well-supported workflow.
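A minimal sketch of what "adding an entry dynamically" could look like. To keep it runnable without jiant installed, TASK_DICT here is a plain dict standing in for jiant.tasks.retrieval.TASK_DICT; the task class, the register_task helper, the registry key, and the dataset path are all hypothetical names chosen for illustration, not part of the library.

```python
# Stand-in for jiant.tasks.retrieval.TASK_DICT (assumption: the real object
# maps a task-type key to a task class, as the comment above suggests).
TASK_DICT = {}


class TSVTextClassificationTask:
    """Hypothetical task class; a real one would subclass a jiant task base class."""

    def __init__(self, name, path):
        self.name = name
        self.path = path


def register_task(key, task_class, registry=TASK_DICT):
    """Add a task class to the registry at runtime, refusing to overwrite a key."""
    if key in registry:
        raise KeyError(f"Task key already registered: {key!r}")
    registry[key] = task_class


# Register from user code, without touching the library source.
register_task("tsv_text_classification", TSVTextClassificationTask)

# Later, the registry key selects the class to instantiate.
task = TASK_DICT["tsv_text_classification"](name="my_dataset", path="/data/my_dataset")
```

With the real library, the registration line would presumably target jiant.tasks.retrieval.TASK_DICT directly, before any code that looks tasks up by key runs.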
What task do you have in mind?
I have many datasets for text or token classification (tagging) in the same format (TSV) but with different labels. However, the labels are fixed in each task class.
I guess I could create TSVTextClassificationTask and override TSVTextClassificationTask.LABELS and TSVTextClassificationTask.LABEL_TO_ID, TSVTextClassificationTask.ID_TO_LABEL = labels_to_bimap(TSVTextClassificationTask.LABELS).
However, that doesn't solve the problem, because the labels are class members. So the labels should be defined in JSON or read from a file, and the evaluation scheme should be specified in JSON as well.
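To illustrate the difference between class-level and per-instance labels, here is a rough sketch in which each task instance reads its own label set from a file. build_label_maps is a local helper that only approximates what labels_to_bimap appears to do in the line above (an assumption); read_labels, the class body, and the example datasets are hypothetical.

```python
import os
import tempfile


def build_label_maps(labels):
    """Return (label_to_id, id_to_label) dicts for an ordered list of labels."""
    label_to_id = {label: i for i, label in enumerate(labels)}
    id_to_label = {i: label for label, i in label_to_id.items()}
    return label_to_id, id_to_label


def read_labels(labels_path):
    """Read one label per line, skipping blank lines."""
    with open(labels_path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


class TSVTextClassificationTask:
    """Hypothetical task whose labels are instance members, not class members."""

    def __init__(self, name, labels_path):
        self.name = name
        self.labels = read_labels(labels_path)
        self.label_to_id, self.id_to_label = build_label_maps(self.labels)


# Two datasets in the same format but with different label sets.
example_labels = {
    "sentiment": ["neg", "pos"],
    "topic": ["sports", "politics", "tech"],
}

with tempfile.TemporaryDirectory() as tmp:
    paths = {}
    for dataset, labels in example_labels.items():
        paths[dataset] = os.path.join(tmp, f"{dataset}_labels.txt")
        with open(paths[dataset], "w", encoding="utf-8") as f:
            f.write("\n".join(labels) + "\n")

    sentiment = TSVTextClassificationTask("sentiment", paths["sentiment"])
    topic = TSVTextClassificationTask("topic", paths["topic"])
```

Because the maps live on the instance, the same class can serve both datasets without one label set overwriting the other.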
Thanks for your input; it helps us think about how we can refine our API. (Part of the difficulty arises from the distinction between tying tasks to datasets and tying tasks to formats.)
For your use case, I think a good approach would be:
jiant.tasks.retrieval.TASK_DICT
names but the same task
based on the TASK_DICT key used above.
Thank you. I have implemented it and it is working: https://github.com/djstrong/jiant/tree/tsv I provide the labels and the evaluation scheme in a task config:
"kwargs": {
    "labels_path": f"{path}/labels.txt",
    "evaluation_scheme": "SimpleAccuracyEvaluationScheme"
}
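For anyone wanting to try something similar, here is a rough sketch of how a config like this could be turned into an evaluation scheme object by name. It is not the code from the linked branch: the SimpleAccuracyEvaluationScheme class below is a stand-in that only shares its name with the one in the config, and EVALUATION_SCHEMES, resolve_evaluation_scheme, the "task" and "name" values, and the path are hypothetical.

```python
import json


class SimpleAccuracyEvaluationScheme:
    """Stand-in scheme: fraction of predictions that match the gold labels."""

    def compute(self, preds, golds):
        if not golds:
            raise ValueError("Cannot compute accuracy on an empty gold list")
        correct = sum(p == g for p, g in zip(preds, golds))
        return {"acc": correct / len(golds)}


# Hypothetical lookup table from the name used in the config to a class.
EVALUATION_SCHEMES = {
    "SimpleAccuracyEvaluationScheme": SimpleAccuracyEvaluationScheme,
}


def resolve_evaluation_scheme(kwargs):
    """Instantiate the evaluation scheme named in the task config kwargs."""
    name = kwargs["evaluation_scheme"]
    try:
        return EVALUATION_SCHEMES[name]()
    except KeyError:
        raise ValueError(f"Unknown evaluation scheme: {name!r}") from None


path = "/data/my_dataset"  # hypothetical dataset directory
task_config = {
    "task": "tsv_text_classification",  # hypothetical registry key
    "name": "my_dataset",
    "kwargs": {
        "labels_path": f"{path}/labels.txt",
        "evaluation_scheme": "SimpleAccuracyEvaluationScheme",
    },
}

# Round-trip through JSON, as the config would be stored on disk.
loaded = json.loads(json.dumps(task_config))
scheme = resolve_evaluation_scheme(loaded["kwargs"])
metrics = scheme.compute(preds=["a", "b", "a", "c"], golds=["a", "b", "b", "c"])
```

Keeping the scheme as a string in the config means a new dataset only needs a new config file and labels file, not a new task class.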
Is it possible to add tasks without editing the library code (dynamically by using Python)?