Use case: A classification model returns a probability distribution. The data scientist wants to extract the probability of one specific class from it and apply further transformations to that value ("decision engineering").
The probability distribution is returned as VectorUDT. It is possible to slice it into a one-element VectorUDT using ml.feature.VectorSlicer. However, most common transformer classes (e.g. ml.feature.Bucketizer) refuse to accept a vector column as input.
The VectorToScalar pseudo-transformer class would simply unwrap a single-element vector to a scalar numeric value (i.e. int, float or double). The data type of the output column can be manually overridden.
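
The intended behaviour can be illustrated with a minimal, Spark-free sketch in plain Python. It only models the semantics described above and is not an implementation of the proposed class: the function name `vector_to_scalar`, its `dtype` parameter, and the choice to raise an error for vectors of any other length are all assumptions made for illustration.

```python
from typing import Sequence, Union

Number = Union[int, float]


def vector_to_scalar(vector: Sequence[float], dtype: type = float) -> Number:
    """Unwrap a single-element vector into a scalar of the requested type.

    Hypothetical helper: name, signature and error handling are assumptions.
    """
    if len(vector) != 1:
        raise ValueError(
            f"Expected a single-element vector, got {len(vector)} elements"
        )
    # The caller may override the output type, e.g. dtype=int.
    return dtype(vector[0])


# Probability distribution over three classes, as a classifier might return it.
probability = [0.1, 0.7, 0.2]

# Step 1: keep only the class of interest; this mimics slicing with index 1
# and still yields a (one-element) vector.
sliced = probability[1:2]

# Step 2: unwrap the one-element vector so that scalar-only transformers
# can consume the value.
p_class1 = vector_to_scalar(sliced)
```

Rejecting vectors that do not have exactly one element is a design choice in this sketch; it keeps an accidental use on the full, unsliced distribution from silently picking an arbitrary element.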