Can't reproduce confusion matrix of attention weights for TvSum video 7, test split 2

Hi,

I thought I would try my luck here...

I am working on your VASNet paper as part of a Deep Learning seminar. For a few days I have been trying to reproduce the Confusion Matrix of Attention Weights for TVSum video 7 from test split 2, which is exactly the matrix shown in the paper.

Unfortunately, I get a completely different figure, and I'm running out of ideas. Here is my code:

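import h5py
import matplotlib.pyplot as plt
import seaborn as sn
import torch
from sklearn import preprocessing

self_attention = SelfAttention()  # your SelfAttention(nn.Module)
self_attention.load_state_dict(torch.load(SELFATT_MODEL_FILE))  # SelfAttention model from split 2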

with h5py.File(TVSUM_DATASET_FILE, 'r') as f:

    video_7 = f['video_7']
    features = video_7['features'][...]

    features = torch.from_numpy(features)

    y, weights = self_attention(features)

    weights = weights.detach().cpu().numpy()

    # Values were normalized to range 0-1 across the matrix.
    weights = preprocessing.minmax_scale(weights, feature_range=(0, 1), axis=0, copy=True)

    fig, ax = plt.subplots()
    ax.xaxis.tick_top()

    # Plot the normalized array directly (weights_df was never defined).
    heatmap = sn.heatmap(
        weights,
        xticklabels=50,
        yticklabels=50,
        cmap="YlGnBu",
        ax=ax)

    plt.show()

And this is what the Confusion Matrix looks like:

[Image: confusion_matrix]

The code definitely loads the correct model for TVSum split 2, and the attention weights it produces are definitely the same as well; I checked them against your VASNet.
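
One thing I have not ruled out is the normalization step. As far as I understand, `minmax_scale(..., axis=0)` rescales each column independently, whereas the paper describes the values as normalized across the whole matrix. Below is a minimal sketch of the difference on a small made-up array (not the real attention weights). It uses plain NumPy to mirror what `axis=0` computes, and the helper name `global_minmax` is hypothetical, not something from the VASNet code:

```python
import numpy as np

def global_minmax(matrix):
    """Rescale all entries to [0, 1] using one global min and max."""
    lo, hi = matrix.min(), matrix.max()
    if hi == lo:
        # Constant matrix: avoid division by zero.
        return np.zeros_like(matrix, dtype=float)
    return (matrix - lo) / (hi - lo)

# Toy stand-in for an attention matrix (values are made up).
toy = np.array([[0.1, 0.8],
                [0.3, 0.2],
                [0.6, 0.0]])

# Per-column scaling, which is what axis=0 in minmax_scale does:
# every column is stretched to span the full 0-1 range on its own.
col_min = toy.min(axis=0)
col_max = toy.max(axis=0)
per_column = (toy - col_min) / (col_max - col_min)

# Global scaling: relative magnitudes between columns are preserved.
whole_matrix = global_minmax(toy)

print(per_column)
print(whole_matrix)
```

If this is the cause, every column of my heatmap would contain both a 0 and a 1, which could explain why the figure looks so different from the one in the paper. I have not confirmed that this is how the paper's figure was produced.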

Does anyone have any idea why this looks like this?
