sindhura97 / STraTS

A problem about the model interpretability #2

Open BowenJIAN opened 1 year ago

BowenJIAN commented 1 year ago

I use this model to predict the crash label in a driving-safety modeling dataset. With this code I can output the predicted labels, but I cannot get the model interpretability results. Does the function 'build_strats_interp()' in the original code actually work? Thanks a lot!

Aminsn commented 1 year ago

Hi @BowenJIAN , I have written some code to extract the contribution scores after fitting the I-STraTS model. I changed the 'build_strats_interp()' function slightly by removing the "fore_op = Dense(V)(conc)" line, because I think it was not consistent with the paper's math (Sindhu can correct me if I am wrong). Here is the code:

# Imports are assumed (tensorflow.keras); the custom CVE, Transformer, and
# Attention layers come from the STraTS codebase and must be in scope.
import numpy as np
from tensorflow.keras.layers import Input, Embedding, Add, Lambda, Concatenate, Dense
from tensorflow.keras.models import Model
from tensorflow.keras import backend as K

def build_strats_interp(D, max_len, V, d, N, he, dropout):

    demo = Input(shape=(D,))
    varis = Input(shape=(max_len,))
    values = Input(shape=(max_len,))
    times = Input(shape=(max_len,))
    varis_emb = Embedding(V+1, d)(varis)
    cve_units = int(np.sqrt(d))
    values_emb = CVE(cve_units, d)(values)
    times_emb = CVE(cve_units, d)(times)
    comb_emb = Add(name = 'comb_emb')([varis_emb, values_emb, times_emb]) 
    mask = Lambda(lambda x:K.clip(x,0,1))(varis) # (batch, max_len): 1 where an observation is present, 0 for padding
    cont_emb = Transformer(N, he, dk=None, dv=None, dff=None, dropout=dropout)(comb_emb, mask=mask)
    attn_weights = Attention(2*d)(cont_emb, mask=mask)
    fused_emb = Lambda(lambda x:K.sum(x[0]*x[1], axis=-2))([comb_emb, attn_weights])
    conc = Concatenate(axis=-1)([fused_emb, demo])
    op = Dense(1, activation='sigmoid', use_bias=False, name = 'fore_op')(conc)
    model = Model([demo, times, values, varis], op)

    return model

fore_model_int = build_strats_interp(D, fore_max_len, V, d, N, he, dropout) # D = number of demographic features

# Extracting the contribution score for a single observation of the first sample
layer_names = [layer.name for layer in fore_model_int.layers]
attentions = [layer for layer in layer_names if 'attention' in layer][0] # Extract the attention layer's name
embeddings_layer_name = 'comb_emb'
last_layer_name = 'fore_op'

layers_outputs = [fore_model_int.get_layer(attentions).output,
                  fore_model_int.get_layer(embeddings_layer_name).output]

get_alpha = Model(inputs=fore_model_int.input, outputs=layers_outputs[0])
get_e = Model(inputs=fore_model_int.input, outputs=layers_outputs[1])
w = fore_model_int.get_layer(last_layer_name).get_weights()[0]

alpha = get_alpha.predict([x[0:1] for x in all_triplets]) # Just the 1st sample
e = get_e.predict([x[0:1] for x in all_triplets]) # Just the 1st sample

i = 0 # observation index
contribution_score = np.dot(w[2:,0], e[0,i,:])*alpha[0,i,0] # contribution score of the first observation

Disclaimer: I am sure there should be a better way to do this (Sindhu can give us a better solution).
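
A possible streamlining (an editor's sketch, not from the original post): a single multi-output Model can return both tensors in one predict call, reusing the 'attentions' and 'embeddings_layer_name' names defined above:

probe = Model(inputs=fore_model_int.input,
              outputs=[fore_model_int.get_layer(attentions).output,
                       fore_model_int.get_layer(embeddings_layer_name).output])
alpha, e = probe.predict([x[0:1] for x in all_triplets]) # both outputs in one forward pass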

BowenJIAN commented 1 year ago

Hi @BowenJIAN , I have written some code to extract the contribution score after fitting the I-STraTS model. [...]

Thanks a lot for your help. I have run this code on my dataset and it works, but I have a question about the line below:

contribution_score = np.dot(w[2:,0], e[0,i,:])*alpha[0,i,0] # contribution score of the first observation

As 'demo' is concatenated after 'fused_emb' in the Concatenate layer, the last D weights of w belong to demo (the last two in my case), so the contribution_score should probably be calculated as:

contribution_score = np.dot(w[:-2,0], e[0,i,:])*alpha[0,i,0] # contribution score of the first observation

Please correct me if I'm wrong.
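
For illustration, a dimension-agnostic form of the same slice (an editor's sketch; d is the embedding dimension passed to build_strats_interp, and the first d weights of w belong to fused_emb because of the Concatenate order):

i = 0 # observation index
contribution_score = np.dot(w[:d,0], e[0,i,:])*alpha[0,i,0] # first d weights multiply fused_emb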

Aminsn commented 1 year ago

@BowenJIAN Yours should be correct. I don't know why I wrote it in reverse order.
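
As a sanity check on the corrected indexing (an editor's sketch, not from the thread): because 'fore_op' has no bias, applying a sigmoid to the summed per-observation contributions plus the demographic term should approximately reproduce the model's prediction, since masked positions receive near-zero attention weights. This assumes all_triplets follows the model's input order [demo, times, values, varis]:

demo_in = all_triplets[0][0:1] # (1, D) demographics of the 1st sample (assumed input order)
obs_scores = alpha[0,:,0] * (e[0] @ w[:d,0]) # per-observation contribution scores
logit = obs_scores.sum() + demo_in[0] @ w[d:,0] # no bias term in 'fore_op'
pred = 1.0/(1.0 + np.exp(-logit)) # should be close to fore_model_int.predict on this sample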