When categorical features are binarized (e.g. using `LabelBinarizer`) instead of integer-encoded (e.g. using `LabelEncoder`), splits on categorical values are encoded as double comparisons against the reference value `1.0000000180025095E-35` (a tiny positive value just above `0`; it is not the smallest positive 64-bit value, which is about `4.9E-324`):
It would be much more transparent and space-efficient to encode the same as integer comparisons against `0` and `1` reference values:

and/or:
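To illustrate why the two encodings should be interchangeable, here is a minimal Python sketch rather than the converter's actual output. It assumes a binarized feature only ever takes the values 0 and 1, and the helper names `split_double` and `split_integer` are hypothetical:

```python
# Reference value quoted above, written as a Python float literal.
THRESHOLD = 1.0000000180025095e-35

# Smallest positive (subnormal) 64-bit double, for comparison.
SMALLEST_POSITIVE_DOUBLE = 5e-324


def split_double(x: float) -> bool:
    """Current style: double comparison against a tiny positive threshold."""
    return x <= THRESHOLD


def split_integer(x: int) -> bool:
    """Requested style: integer comparison against 0."""
    return x == 0


# A binarized feature is assumed to contain only 0 and 1,
# so both predicates send every value down the same branch.
for value in (0, 1):
    assert split_double(float(value)) == split_integer(value)

# The threshold is tiny, but far larger than the smallest positive double.
assert SMALLEST_POSITIVE_DOUBLE < THRESHOLD
```

Since the two predicates agree on both possible inputs, switching to the integer form would not change predictions, only the readability and size of the encoded model.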