FTransUNet: a question about testing; how to plot heatmaps

yingzhige00 commented 6 months ago

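In FTransUNet, why is Stride_Size set to only 32 in the testing part? It makes testing very slow. The test function itself defaults to stride=WINDOW_SIZE[0]; with that default, testing is faster but uses fewer sliding windows. Would that affect the final test results?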

sstary commented 6 months ago

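Hi. Testing runs frequently during training, so to speed it up when training and testing on a larger dataset (LoveDA), you can use a larger sliding-window stride of 128. The performance measured this way is still indicative of the model's quality. A stride of 32 is used for the final test because it yields higher metrics.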
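
To make the speed difference concrete, here is a rough sketch that counts how many windows a sliding-window test visits per image at different strides. The 1024×1024 image size and the 256 window are assumptions for illustration, and `window_origins` / `count_windows` are hypothetical helpers, not functions from the repository (whose own sliding-window code may treat image borders differently).

```python
def window_origins(length, window, stride):
    """Offsets of the windows along one axis; the last one is shifted back to stay in bounds."""
    if window > length:
        raise ValueError("window must not exceed the image side")
    origins = list(range(0, length - window + 1, stride))
    if origins[-1] != length - window:
        origins.append(length - window)  # cover the border with one extra window
    return origins


def count_windows(height, width, window, stride):
    """Number of forward passes needed to cover one image."""
    rows = window_origins(height, window, stride)
    cols = window_origins(width, window, stride)
    return len(rows) * len(cols)


# Assumed sizes: 1024x1024 image, 256x256 window.
for stride in (32, 128, 256):
    print(stride, count_windows(1024, 1024, 256, stride))
# 32 625
# 128 49
# 256 16
```

Under these assumptions a stride of 32 needs roughly 13 times as many forward passes as a stride of 128, which matches the slowdown described in the question; in exchange, each pixel is covered by many more overlapping windows.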

yingzhige00 commented 5 months ago

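Hi, training and testing are both going smoothly now. I saw the attention heatmap visualizations in your paper, but I really cannot reproduce them. Could you share the code for the attention-map visualization? Thank you very much!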

sstary commented 5 months ago

Hi, please change the model call to:

    heatmaps = []
    x, y, attn_weights, features, cnn_x = self.transformer(x, y)  # (B, n_patch, hidden)
    heatmaps.append(cnn_x)
    x = x + y
    x, trans_x = self.decoder(x, features)
    heatmaps.append(trans_x)
    heatmaps.append(x)
    logits = self.segmentation_head(x)
    pred = logits[:, 3, 100, 65]

    ## heatmap
    feature = heatmaps[0]
    feature_grad = autograd.grad(pred, feature, allow_unused=True, retain_graph=True)[0]
    grads = feature_grad  # gradients of pred w.r.t. the feature map
    pooled_grads = torch.nn.functional.adaptive_avg_pool2d(grads, (1, 1))
    # batch size is assumed to be 1 here, so drop dim 0 (the batch dimension)
    pooled_grads = pooled_grads[0]
    feature = feature[0]
    # print("pooled_grads:", pooled_grads.shape)
    # print("feature:", feature.shape)
    # feature.shape[0] is the number of channels of the chosen layer's feature map
    for i in range(feature.shape[0]):
        feature[i, ...] *= pooled_grads[i, ...]
    heatmap = feature.detach().cpu().numpy()
    heatmap = np.mean(heatmap, axis=0)
    heatmap1 = np.maximum(heatmap, 0)
    heatmap1 /= np.max(heatmap1)

    feature = heatmaps[1]
    feature_grad = autograd.grad(pred, feature, allow_unused=True, retain_graph=True)[0]
    grads = feature_grad  # gradients of pred w.r.t. the feature map
    pooled_grads = torch.nn.functional.adaptive_avg_pool2d(grads, (1, 1))
    # batch size is assumed to be 1 here, so drop dim 0 (the batch dimension)
    pooled_grads = pooled_grads[0]
    feature = feature[0]
    # print("pooled_grads:", pooled_grads.shape)
    # print("feature:", feature.shape)
    # feature.shape[0] is the number of channels of the chosen layer's feature map
    for i in range(feature.shape[0]):
        feature[i, ...] *= pooled_grads[i, ...]
    heatmap = feature.detach().cpu().numpy()
    heatmap = np.mean(heatmap, axis=0)
    heatmap2 = np.maximum(heatmap, 0)
    heatmap2 /= np.max(heatmap2)

    feature = heatmaps[2]
    feature_grad = autograd.grad(pred, feature, allow_unused=True, retain_graph=True)[0]
    grads = feature_grad  # gradients of pred w.r.t. the feature map
    pooled_grads = torch.nn.functional.adaptive_avg_pool2d(grads, (1, 1))
    # batch size is assumed to be 1 here, so drop dim 0 (the batch dimension)
    pooled_grads = pooled_grads[0]
    feature = feature[0]
    # print("pooled_grads:", pooled_grads.shape)
    # print("feature:", feature.shape)
    # feature.shape[0] is the number of channels of the chosen layer's feature map
    for i in range(feature.shape[0]):
        feature[i, ...] *= pooled_grads[i, ...]
    heatmap = feature.detach().cpu().numpy()
    heatmap = np.mean(heatmap, axis=0)
    heatmap3 = np.maximum(heatmap, 0)
    heatmap3 /= np.max(heatmap3)

    return logits, heatmap1, heatmap2, heatmap3
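
The three blocks above repeat the same Grad-CAM-style computation: average the gradients over the spatial dimensions, weight each channel by that average, take the channel mean, clip at zero and normalise. A compact sketch of that computation as one helper follows. `grad_cam_map` is a hypothetical name, the two-layer toy network is only a stand-in for the real model, and the out-of-place multiply and the zero-peak guard are additions that are not in the code above.

```python
import numpy as np
import torch
from torch import autograd, nn


def grad_cam_map(score, feature):
    """Grad-CAM-style map of a single-element `score` w.r.t. a (1, C, H, W) `feature`."""
    (grads,) = autograd.grad(score, feature, retain_graph=True, allow_unused=True)
    if grads is None:
        raise ValueError("`feature` is not part of the graph that produced `score`")
    # Channel weights: gradients averaged over H and W.
    weights = grads.mean(dim=(2, 3), keepdim=True)        # (1, C, 1, 1)
    # Weighted channel mean, computed out of place so `feature` stays unchanged.
    cam = (weights * feature.detach()).mean(dim=1)[0]     # (H, W)
    cam = torch.relu(cam).cpu().numpy()
    peak = cam.max()
    return cam / peak if peak > 0 else cam                # guard against an all-zero map


# Toy stand-in for the real network (hypothetical, for illustration only).
torch.manual_seed(0)
backbone = nn.Conv2d(3, 8, kernel_size=3, padding=1)
head = nn.Conv2d(8, 6, kernel_size=1)

image = torch.rand(1, 3, 32, 32)
feature = backbone(image)        # plays the role of cnn_x, trans_x or x
logits = head(feature)
score = logits[0, 3, 10, 6]      # class 3 at row 10, column 6
heatmap = grad_cam_map(score, feature)
print(heatmap.shape)
```

With such a helper, the model's forward would call it once per entry of `heatmaps` instead of repeating the block three times.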

In train.py, change the inference code to:

        # Do the inference
        outs, heatmap1, heatmap2, heatmap3 = net(image_patches, dsm_patches)
        outs = outs.data.cpu().numpy()
        image_patches = np.asarray(255 * torch.squeeze(image_patches).cpu(), dtype='uint8').transpose((1, 2, 0))
        gt_patches = np.asarray(torch.squeeze(gt_patches).cpu(), dtype='uint8').transpose((1, 2, 0))
        heatmap1 = cv2.resize(heatmap1, (256, 256))
        # heatmap[heatmap < 0.7] = 0
        heatmap1 = np.uint8(255 * heatmap1)
        heatmap1 = cv2.applyColorMap(heatmap1, cv2.COLORMAP_JET)
        heatmap1 = heatmap1[:, :, (2, 1, 0)]
        heatmap2 = cv2.resize(heatmap2, (256, 256))
        # heatmap[heatmap < 0.7] = 0
        heatmap2 = np.uint8(255 * heatmap2)
        heatmap2 = cv2.applyColorMap(heatmap2, cv2.COLORMAP_JET)
        heatmap2 = heatmap2[:, :, (2, 1, 0)]
        heatmap3 = cv2.resize(heatmap3, (256, 256))
        # heatmap[heatmap < 0.7] = 0
        heatmap3 = np.uint8(255 * heatmap3)
        heatmap3 = cv2.applyColorMap(heatmap3, cv2.COLORMAP_JET)
        heatmap3 = heatmap3[:, :, (2, 1, 0)]
        x_comp = 65
        y_comp = 100
        fig = plt.figure()
        fig.add_subplot(1, 5, 1)
        plt.imshow(image_patches)
        # plt.title('CFNet', y=-0.1)
        plt.axis('off')

        plt.gca().add_patch(plt.Rectangle((x_comp - 2, y_comp - 2), 2, 2, color='red', fill=False, linewidth=1))

        fig.add_subplot(1, 5, 2)
        plt.imshow(heatmap1)
        # heatmap_str = './CFNet_features' + str(featureid) + '.jpg'
        # cv2.imwrite(heatmap_str, heatmap1)
        plt.gca().add_patch(plt.Rectangle((x_comp - 2, y_comp - 2), 2, 2, color='red', fill=False, linewidth=1))
        plt.axis('off')
        fig.add_subplot(1, 5, 3)
        plt.imshow(heatmap2)
        # heatmap_str = './CFNet_features' + str(featureid+1) + '.jpg'
        # cv2.imwrite(heatmap_str, heatmap2)
        plt.gca().add_patch(plt.Rectangle((x_comp - 2, y_comp - 2), 2, 2, color='red', fill=False, linewidth=1))
        plt.axis('off')
        fig.add_subplot(1, 5, 4)
        plt.imshow(heatmap3)
        # heatmap_str = './CFNet_features' + str(featureid+1) + '.jpg'
        # cv2.imwrite(heatmap_str, heatmap2)
        plt.gca().add_patch(plt.Rectangle((x_comp - 2, y_comp - 2), 2, 2, color='red', fill=False, linewidth=1))
        plt.axis('off')

        fig.add_subplot(1, 5, 5)
        plt.imshow(gt_patches)
        plt.gca().add_patch(plt.Rectangle((x_comp - 2, y_comp - 2), 2, 2, color='red', fill=False, linewidth=1))
        clear_output()
        plt.axis('off')

        # Save before plt.show(): some backends clear the figure once it has been shown
        # plt.savefig('heatmap.png', dpi=1200)
        plt.savefig('heatmap_f_tree'+str(index)+'.pdf', dpi=1200)
        plt.show()

The class and location of the indexed point are set by the line pred = logits[:, 3, 100, 65]. Set batch_size to 1 when plotting.
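
Assuming logits has the usual (batch, class, row, column) layout, the indices 3, 100, 65 mean class 3 at row 100, column 65. plt.Rectangle takes its corner as (x, y), that is (column, row), which is why the plotting code uses x_comp = 65 and y_comp = 100. The sketch below shows that mapping; `marker_box` is a hypothetical helper, and centring the square on the pixel is a variation on the plotting code above, not what it does.

```python
def marker_box(row, col, half=2):
    """Corner (x, y), width and height of a square centred on pixel (row, col)."""
    # Image axes in Matplotlib put x along columns and y along rows.
    xy = (col - half, row - half)
    return xy, 2 * half, 2 * half


target_class, row, col = 3, 100, 65   # same numbers as logits[:, 3, 100, 65]
xy, width, height = marker_box(row, col)
print(xy, width, height)  # (63, 98) 4 4
```

The result can be passed to plt.Rectangle(xy, width, height, color='red', fill=False, linewidth=1) in place of the hard-coded offsets, so that changing the indexed point only requires changing `row` and `col`.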

yingzhige00 commented 5 months ago

Thank you so much!

yingzhige00 commented 5 months ago

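Hi, I put the model-call code in the forward of VisionTransformer in vitcross_seg_modeling.py, but at x, y, attn_weights, features, cnn_x = self.transformer(x, y) # (B, n_patch, hidden), the call returns only four values while five variables are expected to receive them, so it raises an error.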

sstary commented 5 months ago

Of course it raises an error. Just return cnn_x from the transformer yourself, according to what you need; the same goes for the later ones.

yingzhige00 commented 5 months ago

I see.

yingzhige00 commented 5 months ago

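Sorry to bother you again; I am still a beginner. In the transformer I took the last element of the attn_weights list returned by the Encoder. Its shape is [1, 12, 256, 256], while pred has shape [1]. After feature_grad = autograd.grad(pred, feature, allow_unused=True, retain_graph=True)[0], the result is None, so the following steps cannot run. What is wrong, and what should I do?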

sstary commented 5 months ago

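You are extracting the wrong thing. The cnn_x you pass out should simply be the feature maps from the normal data flow. For example, the feature map for the shallow-level heatmap is passed out with the following code: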

def forward(self, x, y):
    y = y.unsqueeze(1)
    if self.hybrid:
        x, y, features = self.hybrid_model(x, y)
    else:
        features = None
    cnn_x = x
    x = self.patch_embeddings(x)  # (B, hidden, n_patches^(1/2), n_patches^(1/2))
    y = self.patch_embeddingsd(y)
    x = x.flatten(2)
    x = x.transpose(-1, -2)  # (B, n_patches, hidden)
    y = y.flatten(2)
    y = y.transpose(-1, -2)

    embeddingsx = x + self.position_embeddings
    embeddingsx = self.dropout(embeddingsx)
    embeddingsy = y + self.position_embeddings
    embeddingsy = self.dropout(embeddingsy)
    # return embeddingsx, embeddingsy, features
    return embeddingsx, embeddingsy, features, cnn_x

Use the x here, not attn_weights. After this change, the forward function of class Transformer also has to pass along one more return value.
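
As a general PyTorch point, autograd.grad with allow_unused=True returns None for any tensor that did not take part in computing pred, so a None result is a sign that the extracted tensor is not on the path that produced the output. The sketch below illustrates both points from this reply with toy modules: threading an extra return value (cnn_x) up through nested forward functions, and the difference between a tensor in the data flow and one outside it. `ToyEmbeddings`, `ToyTransformer` and `side_tensor` are hypothetical stand-ins, not classes from the repository.

```python
import torch
from torch import autograd, nn


class ToyEmbeddings(nn.Module):
    """Stand-in for the Embeddings module; also returns the pre-patch feature map."""

    def __init__(self):
        super().__init__()
        self.patch_embeddings = nn.Conv2d(4, 8, kernel_size=2, stride=2)

    def forward(self, x):
        cnn_x = x                          # feature map in the normal data flow
        x = self.patch_embeddings(x)
        return x, cnn_x                    # extra return value


class ToyTransformer(nn.Module):
    """Stand-in for the Transformer module; forwards the extra value to its caller."""

    def __init__(self):
        super().__init__()
        self.stem = nn.Conv2d(3, 4, kernel_size=3, padding=1)
        self.embeddings = ToyEmbeddings()

    def forward(self, x):
        x = self.stem(x)
        x, cnn_x = self.embeddings(x)
        return x, cnn_x                    # one more return value than before


torch.manual_seed(0)
model = ToyTransformer()
out, cnn_x = model(torch.rand(1, 3, 16, 16))
pred = out[0, 3, 5, 5]

# A tensor that never contributed to `pred`.
side_tensor = torch.rand(1, 12, 16, 16, requires_grad=True)

grad_used = autograd.grad(pred, cnn_x, allow_unused=True, retain_graph=True)[0]
grad_unused = autograd.grad(pred, side_tensor, allow_unused=True, retain_graph=True)[0]
print(grad_used.shape, grad_unused)
```

Here `grad_used` has the same shape as `cnn_x`, while `grad_unused` is None because `side_tensor` is outside the graph of `pred`.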

sstary commented 5 months ago