keras-team / autokeras

AutoML library for deep learning
http://autokeras.com/
Apache License 2.0
9.13k stars, 1.4k forks

Easier way to get the class names in classification #1743

Open leandroimail opened 2 years ago

leandroimail commented 2 years ago

I have a problem with a multiclass prediction model.

I build a custom AutoModel with automatic categorical encoding (categorical_encoding=True), as in the following code:

    import autokeras as ak

    # Excerpt from a class method; self.config holds the training settings.
    input_node = ak.StructuredDataInput()
    output_node = ak.StructuredDataBlock(categorical_encoding=True)(input_node)
    output_node = ak.ClassificationHead()(output_node)
    estimator = ak.AutoModel(
        inputs=input_node,
        outputs=output_node,
        overwrite=True,
        max_trials=self.config.DEEP_MODEL_PARAM["max_trials"],
    )

    estimator.fit(
        x=train_features,
        y=train_label,
        validation_split=self.config.DEEP_MODEL_PARAM["percent_validation_size"],
        epochs=epochs,
        verbose=0,
    )
    # Export the best model found as a plain Keras model.
    self.model = estimator.export_model()

When I use the model to predict, I get an array of probabilities, but I don't know which position in the array refers to which class of my target.

y_prob = self.model.predict(test_features)

The result is an array of arrays, one row per sample. For example, one row looks like this:

[0.18029907, 0.41025335, 0.40944752]

In this example, I used the Iris dataset to test.

But how do I know whether position zero of my probability array is Iris-virginica, Iris-setosa, or Iris-versicolor?

Where can I find the mapping from each array position to the name of the class it represents?

haifeng-jin commented 2 years ago

It should be something similar to the following:

auto_model.tuner.hyper_pipeline.outputs[0].preprocessor.labels

It corresponds to this attribute: https://github.com/keras-team/autokeras/blob/8e128ca7f9ca6f9efb7276be0262c53bd4b279ed/autokeras/preprocessors/encoders.py#L30

It should be a list of strings giving the class labels in the same order as the probabilities. Please let me know if it works. We may have a more elegant way to get this information in the future.
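For readers who want to see how such a label list would be used, here is a minimal sketch of pairing it with the predicted probabilities. It is plain Python and does not call AutoKeras. The helper names `label_probabilities` and `predicted_labels` are made up for illustration, and the label order shown is an assumption: the real order has to be read from the `labels` attribute mentioned above.

```python
def label_probabilities(y_prob, labels):
    """Pair every probability in every row with its class label."""
    named_rows = []
    for row in y_prob:
        row = list(row)
        if len(row) != len(labels):
            raise ValueError(
                f"expected {len(labels)} probabilities per row, got {len(row)}"
            )
        named_rows.append(dict(zip(labels, row)))
    return named_rows


def predicted_labels(y_prob, labels):
    """Return the label with the highest probability for every row."""
    result = []
    for row in y_prob:
        row = list(row)
        best_index = max(range(len(row)), key=lambda i: row[i])
        result.append(labels[best_index])
    return result


# ASSUMPTION: hypothetical label order, for illustration only.
# In practice, read it from the encoder's `labels` attribute.
labels = ["Iris-setosa", "Iris-versicolor", "Iris-virginica"]

# The example row from the question; a NumPy result can be passed as-is
# because each row is converted with list().
y_prob = [[0.18029907, 0.41025335, 0.40944752]]

named = label_probabilities(y_prob, labels)
top = predicted_labels(y_prob, labels)
```

With the assumed order, `top` would be `["Iris-versicolor"]`, because index 1 holds the largest value in the example row. The length check is there so that a mismatch between the label list and the model output fails loudly instead of silently mislabeling.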

leandroimail commented 2 years ago

@haifeng-jin, thank you for your help.

Your solution almost worked. I had to add another level of indexing, as in the following code:

auto_model.tuner.hyper_pipeline.outputs[0][0].preprocessor.labels

When you have a more elegant way to solve this problem, please let me know.

Thanks again for your help.
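Since the two comments above disagree on whether `outputs[0]` is the encoder itself or a list containing it, a small lookup that accepts both shapes may be useful. This is only a sketch under that assumption. `find_class_labels` is a hypothetical helper, not part of AutoKeras. It relies on the internal attributes quoted in this thread, which may change between versions, and the demo uses stand-in objects rather than a real fitted `AutoModel`.

```python
from types import SimpleNamespace


def find_class_labels(auto_model, head_index=0):
    """Return the class labels stored on the fitted output encoder.

    Handles both shapes reported in this thread:
    outputs[i].preprocessor.labels and outputs[i][0].preprocessor.labels.
    """
    entry = auto_model.tuner.hyper_pipeline.outputs[head_index]
    # Treat a single encoder and a list of encoders the same way.
    candidates = entry if isinstance(entry, (list, tuple)) else [entry]
    for candidate in candidates:
        preprocessor = getattr(candidate, "preprocessor", None)
        labels = getattr(preprocessor, "labels", None)
        if labels is not None:
            return list(labels)
    raise AttributeError("no preprocessor.labels found on the output pipeline")


# Stand-in objects that mimic the attribute chain (NOT real AutoKeras objects).
encoder = SimpleNamespace(preprocessor=SimpleNamespace(labels=["a", "b", "c"]))
flat_model = SimpleNamespace(
    tuner=SimpleNamespace(hyper_pipeline=SimpleNamespace(outputs=[encoder]))
)
nested_model = SimpleNamespace(
    tuner=SimpleNamespace(hyper_pipeline=SimpleNamespace(outputs=[[encoder]]))
)

flat_labels = find_class_labels(flat_model)
nested_labels = find_class_labels(nested_model)
```

In the code from the question, the object to pass would be the fitted `estimator` (the `ak.AutoModel`), not the Keras model returned by `export_model()`, because the attribute chain starts from the AutoModel's tuner.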