deepset-ai / FARM

Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.
https://farm.deepset.ai
Apache License 2.0

Classification: is it possible to have targets/classes which are "missing values"? #827

Closed johann-petrak closed 2 years ago

johann-petrak commented 3 years ago

Is there a way to handle instances whose target is a "missing value"? For example, in addition to proper labels, a target such as the empty string or some configurable value (e.g. "NA") could be treated by the head as missing and excluded from the calculation of the loss function.
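To make the request concrete, here is a minimal sketch of a loss that skips missing targets. It is plain Python and not FARM code: the `"NA"` marker, the function name `masked_cross_entropy`, and the label names are all assumptions chosen for illustration.

```python
import math

MISSING = "NA"  # hypothetical marker for a missing target


def masked_cross_entropy(logits, targets, label_list, missing=MISSING):
    """Mean cross-entropy over the instances whose target is not `missing`."""
    total, count = 0.0, 0
    for row, target in zip(logits, targets):
        if target == missing:
            continue  # missing target: contributes nothing to the loss
        idx = label_list.index(target)
        # Numerically stable log-sum-exp of the logits for this instance.
        peak = max(row)
        log_z = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_z - row[idx]
        count += 1
    # A batch with no labeled instance yields a loss of 0.0.
    return total / count if count else 0.0


logits = [[2.0, 0.0], [0.0, 0.0], [0.0, 3.0]]
targets = ["pos", "NA", "neg"]  # the second instance has no label
loss = masked_cross_entropy(logits, targets, ["pos", "neg"])
```

The mean is taken over the labeled instances only, so the second row does not dilute the loss. In PyTorch, `torch.nn.CrossEntropyLoss` has an `ignore_index` argument with the same effect for integer targets; whether a given prediction head exposes it is a separate question.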

This would be very useful in multi-task learning, where not all target values are available for all tasks. I have data where this is the case and need some way to train a multi-task model on it.

Originally I thought there might be a different way to achieve this: add a separate column to the data which indicates the tasks each example can be used for, then create batches that contain examples for only one task, cycling through all the tasks from batch to batch in a round-robin fashion.

But if there is a mechanism to represent "missing target" in some way, this may become easier and the missing target representation could e.g. be used to create batches that contain only examples with labels for a subset of all targets.
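The batching idea can be sketched as follows, deriving task availability from the missing-target marker instead of a separate column. This is an illustrative sketch in plain Python, not FARM's data pipeline: the `"NA"` marker, the function name `round_robin_batches`, and the example field names are assumptions.

```python
from itertools import zip_longest

MISSING = "NA"  # hypothetical marker for a missing target


def round_robin_batches(examples, tasks, batch_size, missing=MISSING):
    """Yield (task, batch) pairs, cycling through the tasks in order.

    Each batch contains only examples that have a label for its task.
    """
    per_task = []
    for task in tasks:
        labeled = [ex for ex in examples if ex.get(task, missing) != missing]
        batches = [labeled[i:i + batch_size]
                   for i in range(0, len(labeled), batch_size)]
        per_task.append([(task, batch) for batch in batches])
    # Take one batch per task in turn; tasks that run out are skipped.
    for group in zip_longest(*per_task):
        for item in group:
            if item is not None:
                yield item


examples = [
    {"text": "a", "sentiment": "pos", "topic": "NA"},
    {"text": "b", "sentiment": "NA", "topic": "sports"},
    {"text": "c", "sentiment": "neg", "topic": "politics"},
    {"text": "d", "sentiment": "pos", "topic": "NA"},
]
schedule = [
    (task, [ex["text"] for ex in batch])
    for task, batch in round_robin_batches(
        examples, ["sentiment", "topic"], batch_size=2)
]
```

An example labeled for several tasks (like `"c"` here) appears in one batch per task. Tasks with fewer labeled examples simply drop out of the rotation early; oversampling them would be an alternative design.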

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 21 days if no further activity occurs.