If the arff header contains a "categorical boolean", i.e. a nominal attribute with possible values {true, false} (or similar), then openml-python converts it to bool when loading the data into a dataframe. AMLB in turn wrote such a column to the split arff files as numeric, which could cause issues for frameworks that rely on the split arff files produced by the benchmark (e.g., h2oautoml), especially when the column was the target. The affected benchmark tasks are: kc1 (openml/t/3917), pc4 (openml/t/359958), and miniboone (openml/t/359990).
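
The effect can be sketched with pandas alone. The snippet below is an illustration under stated assumptions, not AMLB's actual code: `arff_attribute_type` and `bools_to_nominal` are hypothetical helpers, and the labels "true"/"false" are assumed, since the original nominal values may be spelled differently. It shows how a writer that infers the arff type from the column dtype would treat a bool column as numeric, and how casting the column back to a string-labelled categorical keeps it nominal.

```python
import pandas as pd


def arff_attribute_type(series: pd.Series) -> str:
    """Hypothetical type inference, similar in spirit to a dtype-based arff writer."""
    if isinstance(series.dtype, pd.CategoricalDtype):
        # Nominal attribute: list the categories in the header.
        return "{" + ",".join(map(str, series.cat.categories)) + "}"
    if pd.api.types.is_bool_dtype(series) or pd.api.types.is_numeric_dtype(series):
        # A plain bool column ends up here, which is the problem described above.
        return "NUMERIC"
    return "STRING"


def bools_to_nominal(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy in which every bool column is a categorical with string labels.

    Assumption: the labels "true"/"false" stand in for the original nominal values.
    """
    out = df.copy()
    for col in out.select_dtypes(include="bool").columns:
        labels = out[col].map({True: "true", False: "false"})
        out[col] = pd.Categorical(labels, categories=["false", "true"])
    return out


# Toy frame mimicking a dataset with a boolean target column.
df = pd.DataFrame({"loc": [1.0, 2.0, 3.0], "defects": [True, False, True]})
print(arff_attribute_type(df["defects"]))                    # bool column as loaded
print(arff_attribute_type(bools_to_nominal(df)["defects"]))  # after the cast
```

Note that `select_dtypes(include="bool")` only matches NumPy-backed bool columns; pandas' nullable `boolean` dtype would need to be handled separately.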