marcotcr / lime

Lime: Explaining the predictions of any machine learning classifier

How to use LIME when we have mixed feature(Text and tabular features both)? #679

Open vishalkumarbit opened 2 years ago

vishalkumarbit commented 2 years ago

I am trying to do error analysis/model interpretation using LIME. My model (a deep learning model) is trained on mixed features (both text and tabular). It takes 3 inputs: two tokenized, padded text features and one set of tabular features.

Please find the model training code below:

```python
import datetime

import tensorflow as tf
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau
from tensorflow.keras.initializers import glorot_uniform
from tensorflow.keras.layers import BatchNormalization, Concatenate, Conv1D, Dense, Dropout, Embedding, Flatten, Input, MaxPooling1D
from tensorflow.keras.models import Model
from tensorflow.keras.regularizers import l2

# Text branch 1: tokenized, padded comment (length 50)
input_layer_comment = Input(shape=(50,), name='input_layer_comment')
embedding_layer_comment = Embedding(len(token_comment.word_index)+1, 300, weights=[embedding_matrix_comment_fast], input_length=50, trainable=True, name='Emb_layer_comment')(input_layer_comment)
dropout_layer_comment = Dropout(0.4, name='dropout_layer_comment')(embedding_layer_comment)
first_cov_comment = Conv1D(256, 3, activation='relu', padding='same', name='first_cov_comment', kernel_initializer=glorot_uniform())(dropout_layer_comment)
first_maxpool_comment = MaxPooling1D(pool_size=2, padding='same', name='first_maxpool_comment')(first_cov_comment)
second_cov_comment = Conv1D(128, 3, activation='relu', padding='same', name='second_cov_comment', kernel_initializer=glorot_uniform())(first_maxpool_comment)
second_maxpool_comment = MaxPooling1D(pool_size=2, padding='same', name='second_maxpool_comment')(second_cov_comment)
flat_layer_comment = Flatten(name='Flat_layer_comment')(second_maxpool_comment)

# Text branch 2: tokenized, padded parent comment (length 500)
input_layer_pcomment = Input(shape=(500,), name='input_layer_pcomment')
embedding_layer_pcomment = Embedding(len(token_pcomment.word_index)+1, 300, weights=[embedding_matrix_pcomment_fast], input_length=500, trainable=True, name='Emb_layer_pcomment')(input_layer_pcomment)
dropout_layer_pcomment = Dropout(0.4, name='dropout_layer_pcomment')(embedding_layer_pcomment)
first_cov_pcomment = Conv1D(256, 3, activation='relu', padding='same', name='first_cov_pcomment', kernel_initializer=glorot_uniform())(dropout_layer_pcomment)
first_maxpool_pcomment = MaxPooling1D(pool_size=2, padding='same', name='first_maxpool_pcomment')(first_cov_pcomment)
second_cov_pcomment = Conv1D(128, 3, activation='relu', padding='same', name='second_cov_pcomment', kernel_initializer=glorot_uniform())(first_maxpool_pcomment)
second_maxpool_pcomment = MaxPooling1D(pool_size=2, padding='same', name='second_maxpool_pcomment')(second_cov_pcomment)
flat_layer_pcomment = Flatten(name='Flat_layer_pcomment')(second_maxpool_pcomment)

# Tabular branch: 15 numeric features, shaped (15, 1) for Conv1D
otherthan_text_input = Input(shape=(15, 1), name='Otherthan_text_input')
conv_layer_1 = Conv1D(256, 3, activation='relu', padding='same', kernel_initializer=glorot_uniform(), name='conv_layer_1', kernel_regularizer=l2(0.01))(otherthan_text_input)
max_pool_1 = MaxPooling1D(pool_size=2, padding='same', name='max_pool_1')(conv_layer_1)
conv_layer_2 = Conv1D(128, 3, activation='relu', padding='same', kernel_initializer=glorot_uniform(), name='conv_layer_2', kernel_regularizer=l2(0.01))(max_pool_1)
conv_layer_3 = Conv1D(64, 3, activation='relu', padding='same', kernel_initializer=glorot_uniform(), name='conv_layer_3', kernel_regularizer=l2(0.01))(conv_layer_2)
max_pool_2 = MaxPooling1D(pool_size=2, padding='same', name='max_pool_2')(conv_layer_3)
conv_layer_4 = Conv1D(32, 3, activation='relu', padding='same', kernel_initializer=glorot_uniform(), name='conv_layer_4', kernel_regularizer=l2(0.01))(max_pool_2)
flattened_layer_2 = Flatten(name='flattened_layer_2')(conv_layer_4)

# Merge the three branches and classify
merged_layer = Concatenate(axis=-1)([flat_layer_comment, flat_layer_pcomment, flattened_layer_2])
dense_layer1 = Dense(512, activation='relu', kernel_initializer='he_normal', name='dense_layer1', kernel_regularizer=l2(0.01))(merged_layer)
batch_norm_1 = BatchNormalization()(dense_layer1)
drop_layer1 = Dropout(0.6)(batch_norm_1)
dense_layer2 = Dense(256, activation='relu', kernel_initializer='he_normal', name='dense_layer2', kernel_regularizer=l2(0.01))(drop_layer1)
batch_norm_2 = BatchNormalization()(dense_layer2)
drop_layer2 = Dropout(0.3)(batch_norm_2)
dense_layer3 = Dense(128, activation='relu', kernel_initializer='he_normal', name='dense_layer3', kernel_regularizer=l2(0.01))(drop_layer2)
output_layer = Dense(1, activation='sigmoid')(dense_layer3)

# Callbacks and optimizer
logdir = "logs/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=logdir)
earlystop = EarlyStopping(monitor='val_accuracy', mode='max', min_delta=0, patience=3, verbose=1)
reduce_lr = ReduceLROnPlateau(monitor='val_accuracy', factor=0.8, patience=1, verbose=1)
optimizer = tf.keras.optimizers.Adam(learning_rate=0.0001)

model = Model(inputs=[input_layer_comment, input_layer_pcomment, otherthan_text_input], outputs=output_layer)
model.summary()
```

Please help me with how to use LIME here.
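
For context on what I have reasoned through so far: since LIME ships separate single-modality explainers (`LimeTextExplainer`, `LimeTabularExplainer`, `LimeImageExplainer`), the closest workaround I can see is to explain one input at a time while holding the other two fixed at the instance's actual values. A minimal sketch for the comment text input with `LimeTextExplainer`, assuming `raw_comment` is the raw string being explained and `pcomment_padded_row` (shape `(500,)`) / `tabular_row` (shape `(15, 1)`) are the already-preprocessed values of the other two inputs (all hypothetical names):

```python
import numpy as np
from lime.lime_text import LimeTextExplainer
from tensorflow.keras.preprocessing.sequence import pad_sequences

def make_comment_classifier_fn(pcomment_padded_row, tabular_row):
    """Wrap the 3-input model as the (list of texts -> class probabilities) function LIME expects."""
    def classifier_fn(texts):
        n = len(texts)
        # Preprocess the perturbed comment texts exactly as during training
        comment_batch = pad_sequences(token_comment.texts_to_sequences(texts), maxlen=50)
        # Hold the other two inputs fixed, tiled to the batch size
        pcomment_batch = np.repeat(pcomment_padded_row[np.newaxis, :], n, axis=0)
        tabular_batch = np.repeat(tabular_row[np.newaxis, :, :], n, axis=0)
        p = model.predict([comment_batch, pcomment_batch, tabular_batch])
        return np.hstack([1 - p, p])  # sigmoid output -> two-column probabilities
    return classifier_fn

explainer = LimeTextExplainer(class_names=['class_0', 'class_1'])
exp = explainer.explain_instance(raw_comment, make_comment_classifier_fn(pcomment_padded_row, tabular_row), num_features=10)
exp.show_in_notebook()
```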
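
And correspondingly for the tabular input, a sketch with `LimeTabularExplainer`, assuming `X_tabular_train` is the flat `(N, 15)` training matrix, `tabular_feature_names` lists the 15 feature names, and `comment_padded_row` / `pcomment_padded_row` are the fixed, already-padded text inputs (again hypothetical names):

```python
import numpy as np
from lime.lime_tabular import LimeTabularExplainer

def make_tabular_predict_fn(comment_padded_row, pcomment_padded_row):
    """LIME perturbs rows as a flat (n, 15) array; reshape to (n, 15, 1) for the Conv1D branch."""
    def predict_fn(rows):
        n = rows.shape[0]
        comment_batch = np.repeat(comment_padded_row[np.newaxis, :], n, axis=0)
        pcomment_batch = np.repeat(pcomment_padded_row[np.newaxis, :], n, axis=0)
        p = model.predict([comment_batch, pcomment_batch, rows.reshape(n, 15, 1)])
        return np.hstack([1 - p, p])
    return predict_fn

tab_explainer = LimeTabularExplainer(X_tabular_train, feature_names=tabular_feature_names, class_names=['class_0', 'class_1'], mode='classification')
exp = tab_explainer.explain_instance(tabular_row.reshape(15), make_tabular_predict_fn(comment_padded_row, pcomment_padded_row), num_features=10)
```

Is explaining one modality at a time like this reasonable, or is there a recommended way to get a single joint explanation over the text and tabular features together?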