rasbt / mlxtend

A library of extension and helper modules for Python's data analysis and machine learning libraries.
https://rasbt.github.io/mlxtend/
Other

Unable to apply SFS to a sklearn-wrapped Keras classifier neuralnet #771

Closed: leowang396 closed this issue 3 years ago

leowang396 commented 3 years ago

Hi there, I would really appreciate any help or pointers on this issue.

I'm trying to use SFS to select a minimal suitable set of features for my 3-class classification neural network, with KerasClassifier as the scikit-learn wrapper. I need help figuring out why my input data does not reach the first layer of the network in the right shape. Fitting raises:

ValueError: Input 0 of layer sequential is incompatible with the layer: expected axis -1 of input shape to have value 4 but received input with shape [None, 1]

I tried to address this by setting input_shape of the first layer to train_input.shape[1], but it still fails.

I looked into the error message as much as I could. It seems that the width of the input changes to 1 during processing, which may be what causes the error, but I do not know how to fix it. At the start, the input has 4 columns (first screenshot); after a few steps, it seems to have only 1 column (second screenshot).
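
The suspected cause can be illustrated with a small toy simulation. This is only a sketch under assumptions, not mlxtend's or Keras's actual code: `FixedWidthModel`, `select_columns`, and `first_round_subsets` are hypothetical names made up for the illustration. It assumes that forward selection begins by fitting the estimator on each single feature on its own, while the model's input width was fixed to 4 when it was built.

```python
from itertools import combinations

N_FEATURES = 4  # iris has 4 feature columns


class FixedWidthModel:
    """Toy stand-in (hypothetical) for a network whose input width is fixed at build time."""

    def __init__(self, expected_width):
        self.expected_width = expected_width

    def fit(self, rows):
        # Reject any input whose column count differs from the build-time width.
        width = len(rows[0])
        if width != self.expected_width:
            raise ValueError(
                f"expected axis -1 to have value {self.expected_width} "
                f"but received input with shape [None, {width}]"
            )
        return self


def select_columns(rows, subset):
    """Keep only the columns listed in `subset`, as a feature selector would."""
    return [[row[i] for i in subset] for row in rows]


def first_round_subsets(n_features):
    """Assumed first round of forward selection: every single feature on its own."""
    return list(combinations(range(n_features), 1))


rows = [[5.1, 3.5, 1.4, 0.2], [6.2, 2.9, 4.3, 1.3]]  # two iris-like rows
model = FixedWidthModel(expected_width=N_FEATURES)

errors = []
for subset in first_round_subsets(N_FEATURES):
    candidate = select_columns(rows, subset)  # 2 rows, 1 column
    try:
        model.fit(candidate)
    except ValueError as exc:
        errors.append(str(exc))

print(errors[0])
```

If this is indeed what happens, every single-column candidate fails in the same way, which would suggest building the model so that its input width follows the data passed to fit rather than the full training frame.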

The error is reproduced below using the iris dataset from sklearn:

from keras.wrappers.scikit_learn import KerasClassifier
from mlxtend.feature_selection import SequentialFeatureSelector as SFS
import sklearn
from sklearn.datasets import load_iris
from tensorflow import keras
# With return_X_y=True, load_iris returns (features, labels), not a train/test split
iris_train, iris_test = load_iris(return_X_y=True, as_frame=True)

EPOCHS_IRIS = 100
BATCH_SIZE_IRIS = 16
TRAIN_TEST_SPLIT = 0.8  # assumed value; used below but not defined in the original snippet

def create_model(train_input):
    # 1 ReLU layer + 1 Dropout layer + 1 softmax layer for 3 classes
    model = keras.Sequential([
        keras.layers.Dense(16,
                           activation='relu',
                           input_shape=((train_input.shape[1]),)),
        keras.layers.Dropout(0.5),
        keras.layers.Dense(3,
                           activation='softmax')
    ])

    model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-3),
                  loss=keras.losses.SparseCategoricalCrossentropy())

    return model

keras.backend.clear_session()
# Wrap the Keras network and create the SFS object
skwrapped_model = KerasClassifier(build_fn=create_model,
                                  train_input=iris_train,
                                  epochs=EPOCHS_IRIS,
                                  batch_size=BATCH_SIZE_IRIS,
                                  validation_split=1-TRAIN_TEST_SPLIT,
                                  verbose=0)
sffs = SFS(skwrapped_model,
           k_features=(1, iris_train.shape[1]),
           floating=True,
           clone_estimator=False,
           cv=0,
           n_jobs=1,
           scoring='accuracy')

# Apply SFS to identify best feature subset
sffs = sffs.fit(iris_train,
                iris_test)

Any help is much appreciated, thank you!