IBM / unitxt

🦄 Unitxt: a python library for getting data fired up and set for training and evaluation
https://www.unitxt.ai
Apache License 2.0

Suggestion: Add notion of SubTask #694

Open elronbandel opened 8 months ago

elronbandel commented 8 months ago

Today we have very general tasks such as tasks.classification.multi_class, which are parameterized by fields like class_type. This lets us put sentiment and emotion classification under the same task, for example by using AddFields(fields={"class_type": "sentiment"}) for sentiment.

I suggest adding the concept of a SubTask: it is also a Task, but it is created by filling in fields of the Task it inherits from.

For example, tasks.classification.multi_class.sentiment_extraction:

SubTask(parent="tasks.classification.multi_class", with={"class_type": "sentiment"})

That way you get the best of both worlds: (1) very general tasks with templates that fit many tasks, and (2) when needed, a specific template for a specific task.
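To make the proposal concrete, here is a minimal, library-independent sketch of how a SubTask could fill in fields of its parent. It does not use the unitxt API: the toy Task class, the required_fields attribute, and the process method are all assumptions made for illustration. Because with is a reserved word in Python, the sketch uses a hypothetical fields parameter instead.

```python
from typing import Any, Dict, List


class Task:
    """Toy stand-in for a task (hypothetical, not the unitxt Task class)."""

    def __init__(self, name: str, required_fields: List[str]):
        self.name = name
        self.required_fields = list(required_fields)

    def process(self, instance: Dict[str, Any]) -> Dict[str, Any]:
        # Verify that the instance provides every field the task needs.
        missing = [f for f in self.required_fields if f not in instance]
        if missing:
            raise ValueError(f"{self.name}: missing fields {missing}")
        return dict(instance)


class SubTask(Task):
    """A Task created by fixing some fields of a parent Task."""

    def __init__(self, name: str, parent: Task, fields: Dict[str, Any]):
        # Only fields that the parent actually declares can be fixed.
        unknown = [f for f in fields if f not in parent.required_fields]
        if unknown:
            raise ValueError(f"{name}: parent has no fields {unknown}")
        # The sub-task only requires the fields that are not pre-filled.
        remaining = [f for f in parent.required_fields if f not in fields]
        super().__init__(name, remaining)
        self.parent = parent
        self.fixed_fields = dict(fields)

    def process(self, instance: Dict[str, Any]) -> Dict[str, Any]:
        # Check the remaining fields, add the fixed ones, then delegate
        # to the parent so its own validation still applies.
        checked = super().process(instance)
        return self.parent.process({**checked, **self.fixed_fields})


multi_class = Task(
    "tasks.classification.multi_class",
    ["text", "label", "class_type"],
)
sentiment = SubTask(
    "tasks.classification.multi_class.sentiment_extraction",
    parent=multi_class,
    fields={"class_type": "sentiment"},
)
result = sentiment.process({"text": "Great movie!", "label": "positive"})
```

In this sketch the sub-task is still a Task, so anything that accepts the general task (for example a general template) could accept it too, while a sentiment-specific template could target the sub-task by name.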

yoavkatz commented 8 months ago

I think different tasks have both different defaults and different field names. The question is whether adding a new concept offers a significant advantage over just reusing the basic building blocks (metrics, template classes, post processors).

In general, I think tasks should have the option of specifying default field names (e.g. "text_type": "text").
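One possible reading of this suggestion is a task that carries its own defaults and applies them whenever an instance omits a field. The sketch below is illustrative only: TaskWithDefaults, its defaults argument, and its process method are hypothetical names, not part of unitxt.

```python
from typing import Any, Dict, List, Optional


class TaskWithDefaults:
    """Toy task that fills omitted fields from per-task defaults (hypothetical)."""

    def __init__(
        self,
        required_fields: List[str],
        defaults: Optional[Dict[str, Any]] = None,
    ):
        self.required_fields = list(required_fields)
        self.defaults = dict(defaults or {})

    def process(self, instance: Dict[str, Any]) -> Dict[str, Any]:
        # Values given in the instance take precedence over the defaults.
        merged = {**self.defaults, **instance}
        missing = [f for f in self.required_fields if f not in merged]
        if missing:
            raise ValueError(f"missing fields {missing}")
        return merged


task = TaskWithDefaults(
    ["text", "text_type", "label"],
    defaults={"text_type": "text"},
)
with_default = task.process({"text": "hi", "label": "greeting"})
overridden = task.process(
    {"text": "hi", "label": "greeting", "text_type": "tweet"}
)
```

Under this reading, a dataset card would only need to set text_type when it differs from the task's default, which covers part of what a SubTask would offer without introducing a new concept.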