salesforce / TransmogrifAI

TransmogrifAI (pronounced trăns-mŏgˈrə-fī) is an AutoML library for building modular, reusable, strongly typed machine learning workflows on Apache Spark with minimal hand-tuning.
https://transmogrif.ai
BSD 3-Clause "New" or "Revised" License

Is it possible to do multi-label classification with TransmogrifAI? #145

Open onema opened 6 years ago

onema commented 6 years ago

Problem: This is more of a question, and a feature request if the answer to the question is no.

Is it possible to do multi-label classification with TransmogrifAI?

Solution: I should be able to generate models that map features to a label vector, where each element is set to 1 or 0 depending on whether the corresponding label applies.

Alternatives: N/A

Additional context: Note that this is different from multi-class classification (the Iris example).

Multi-class classification

makes the assumption that each sample is assigned to one and only one label: a fruit can be either an apple or a pear but not both at the same time [1]

On the other hand, multi-label classification

can be thought as predicting properties of a data-point that are not mutually exclusive, such as topics that are relevant for a document. A text might be about any of religion, politics, finance or education at the same time or none of these. [1]
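To make the distinction concrete, here is a minimal sketch in plain Python (not TransmogrifAI code) of the 0/1 target vector described above. The topic list and the `to_indicator` helper are hypothetical names chosen for illustration, based on the topics in the quoted example.

```python
# Hypothetical label set, taken from the topics named in the quote above.
TOPICS = ["religion", "politics", "finance", "education"]


def to_indicator(labels, classes=TOPICS):
    """Return a 0/1 vector with a 1 at each position whose class is in `labels`."""
    present = set(labels)
    unknown = present - set(classes)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if c in present else 0 for c in classes]


# Multi-class: exactly one position is set.
multi_class_target = to_indicator(["politics"])

# Multi-label: any number of positions may be set, including none.
multi_label_target = to_indicator(["politics", "finance"])
no_topic_target = to_indicator([])
```

A multi-class target always contains exactly one 1, while a multi-label target may contain several or none.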

References:

  1. Multiclass and multilabel algorithms
  2. Multi-label classification

leahmcguire commented 6 years ago

Thanks for the question! We don't currently have good support for this. Right now the only solution is to make a separate model or model selector for each label. This will then generate a prediction per label from which you can extract the 1 or 0 prediction for each label. We should definitely add better support for this and will add it to the backlog of features.
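The one-model-per-label workaround can be sketched in plain Python as follows. This is an illustration of the idea only, not TransmogrifAI API: `NearestNeighbourBinary` is a hypothetical stand-in for whatever binary model or model selector would be trained per label, and `fit_per_label` and `predict_indicator` are made-up helper names.

```python
class NearestNeighbourBinary:
    """Stand-in binary classifier: predicts the target of the closest training row."""

    def fit(self, rows, targets):
        self.rows = list(rows)
        self.targets = list(targets)
        return self

    def predict_one(self, row):
        def squared_distance(other):
            return sum((a - b) ** 2 for a, b in zip(row, other))

        best = min(range(len(self.rows)), key=lambda i: squared_distance(self.rows[i]))
        return self.targets[best]


def fit_per_label(rows, indicator_targets, make_model=NearestNeighbourBinary):
    """Train one independent binary model for each column of the 0/1 target matrix."""
    n_labels = len(indicator_targets[0])
    return [
        make_model().fit(rows, [target[j] for target in indicator_targets])
        for j in range(n_labels)
    ]


def predict_indicator(models, row):
    """Collect each model's 0/1 prediction into a single multi-label vector."""
    return [model.predict_one(row) for model in models]


# Toy data: two features, two labels that are not mutually exclusive.
rows = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
targets = [[0, 0], [0, 1], [1, 0], [1, 1]]

models = fit_per_label(rows, targets)
prediction = predict_indicator(models, (0.9, 0.1))
```

Each label is learned in isolation here, so any correlation between labels is ignored; that is the main trade-off of this approach.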

onema commented 6 years ago

@leahmcguire, thank you for your response! Looking forward to future improvements.