speedinghzl / CCNet

CCNet: Criss-Cross Attention for Semantic Segmentation (TPAMI 2020 & ICCV 2019).
MIT License

A question about Visualization #3

Closed · hellodfan closed this issue 5 years ago

hellodfan commented 5 years ago

Nice job! Would you mind providing the code or details on how to implement the visualization of the attention map?

speedinghzl commented 5 years ago

Thanks for your attention. The code below may be helpful.

import numpy as np

def shows_attention1(att_maps, pos=[28, 122]):
    # Unfold the first-step attention vector at `pos` onto the (h, w) plane.
    att_map1 = att_maps[0]  # shape (1, h+w-1, h, w)
    _, _, h, w = att_map1.shape
    vis_map = np.zeros((h, w), dtype=np.float32)
    # The h+w-1 weights for `pos`: the first w cover its row, the remaining
    # h-1 cover its column excluding row pos[0] itself.
    att_vector = att_map1[0, :, pos[0], pos[1]]
    for i in range(w):
        vis_map[pos[0], i] = att_vector[i]
    for i in range(w, h + w - 1):
        new_i = i - w
        # Compare new_i (not i) to pos[0]: column entries at or past the
        # center row are shifted by one because that row is excluded.
        if new_i >= pos[0]:
            new_i = new_i + 1
        vis_map[new_i, pos[1]] = att_vector[i]
    return vis_map

def shows_attention2(att_maps, pos=[28, 122]):
    # Compose the two recurrence steps: weight each criss-cross neighbour's
    # first-step map by the second-step attention at `pos`.
    att_map1 = att_maps[0]
    att_map2 = att_maps[1]  # second recurrence step, same shape as att_map1
    _, _, h, w = att_map1.shape
    vis_map = np.zeros((h, w), dtype=np.float32)

    att_vector = att_map2[0, :, pos[0], pos[1]]
    # Pixels on the same row as `pos`.
    for i in range(w):
        map_step1 = shows_attention1(att_maps, pos=[pos[0], i])
        vis_map += att_vector[i] * map_step1
    # Pixels on the same column, excluding row pos[0]
    # (same index correction as in shows_attention1).
    for i in range(w, h + w - 1):
        new_i = i - w
        if new_i >= pos[0]:
            new_i = new_i + 1
        map_step1 = shows_attention1(att_maps, pos=[new_i, pos[1]])
        vis_map += att_vector[i] * map_step1
    return vis_map

def make_image(vis_map, outputname):
    # Render the accumulated attention weights as a heat map and save it.
    import matplotlib.pyplot as plt
    fig = plt.imshow(vis_map, cmap='hot', interpolation='bilinear')
    fig.axes.get_xaxis().set_visible(False)
    fig.axes.get_yaxis().set_visible(False)
    plt.margins(0, 0)
    plt.savefig(outputname)

Then you can visualize the attention map.

# att_maps: the attention maps from RCCA, one tensor per recurrence step
att_maps = [att.data.cpu().numpy() for att in att_maps]
vis_map = shows_attention2(att_maps, [19, 70])
make_image(vis_map, 'attention_vis.png')
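
Here att_maps must be collected from the RCCA module yourself: the released CrissCrossAttention.forward computes the post-softmax attention tensor of shape (N, H+W-1, H, W) internally but does not return it. Below is a minimal sketch of one way to expose it, assuming the module structure of the released cc_attention code (the names follow that code, but treat the patch as illustrative, not official):

# Hypothetical patch to CrissCrossAttention.forward (cc_attention): also
# return the post-softmax attention map alongside the output.
# Assumes the module's existing imports (torch.nn.functional as F,
# ca_weight, ca_map).
def forward(self, x):
    proj_query = self.query_conv(x)
    proj_key = self.key_conv(x)
    proj_value = self.value_conv(x)
    energy = ca_weight(proj_query, proj_key)  # (N, H+W-1, H, W)
    attention = F.softmax(energy, 1)          # criss-cross attention map
    out = ca_map(attention, proj_value)
    out = self.gamma * out + x
    return out, attention                     # expose the map as well

# In RCCAModule, collect one map per recurrence step (R = 2 gives two maps):
att_maps = []
for _ in range(recurrence):
    output, att = self.cca(output)
    att_maps.append(att)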

PS: We suppress the maximal values in vis_map for clearer visualization.

krishnakanthnakka commented 4 years ago

Dear Author,

Could you please elaborate on how you inhibit the maximal value?

Running the above code produces the attention map below for R = 2. However, the result in the first row of Fig. 6 of the paper covers the context of the entire car region, while the map below focuses only on the axial directions.

[attached image: attention_vis, the attention map produced by the code above]

speedinghzl commented 4 years ago

Hi @krishnakanthnakka, I set any weight (w) in the attention map that is greater than a threshold (t) to t, for clearer visualization (in short, w = t if w > t).
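
In NumPy terms that clipping is a one-liner; a minimal sketch, assuming vis_map is the array returned by shows_attention2 and t is a hand-picked constant (the actual value is not stated here):

import numpy as np

t = 0.05  # illustrative threshold, chosen by eye; not a value from the paper
vis_map = np.minimum(vis_map, t)  # w = t wherever w > t, elementwise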

krishnakanthnakka commented 4 years ago

Thank you for clarifying. I was expecting that, for the given point on the car, it would attend to all car pixels with high values and to other classes with low attention.

It looks to me that, even with recurrent connections, the context comes mostly from the axial directions.

speedinghzl commented 4 years ago

It is quite normal that the pixels on the criss-cross path get higher weights in CCNet. The recurrent connections make it possible to aggregate context from outside the criss-cross path. Besides, since the number of pixels outside the criss-cross path is far greater than the number on it, the total context contributed from outside the path is also considerable.
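
As a toy check of the recurrence argument (not the author's code): with two criss-cross hops, every pixel can reach every other pixel, since any target (u, v) is reachable from (i, j) through the intermediate pixel (u, j) or (i, v). The sketch below builds the binary criss-cross adjacency on a small grid and verifies that its square is fully dense.

import numpy as np

# Illustrative verification on a small h x w grid:
# A[p, q] = 1 iff q lies on p's criss-cross path (same row or column).
h, w = 4, 5
n = h * w
A = np.zeros((n, n), dtype=np.float32)
for r in range(h):
    for c in range(w):
        p = r * w + c
        A[p, r * w:(r + 1) * w] = 1  # every pixel in the same row
        A[p, c::w] = 1               # every pixel in the same column
print(bool((A @ A > 0).all()))  # True: two hops cover the full plane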

chixma commented 3 years ago

Hi @krishnakanthnakka, I set any weight (w) in the attention map that is greater than a threshold (t) to t, for clearer visualization (in short, w = t if w > t).

How do you set the threshold (t)? Is it a constant? If so, which value?

lizijue commented 2 years ago

Hello @speedinghzl,

I guess the reason the weights of pixels on the criss-cross path are so high in Fig. 6 when R = 2 is that these pixels have already built a connection with the pixel marked in green, so in the second CCA step they are enhanced again, leading to high values in the attention map.

But I found that most of these pixels in the raw image are semantically far from the pixel marked in green, so is this phenomenon a bug in CCNet?

It would be kind of you to help me.

speedinghzl commented 2 years ago

But I found that most of these pixels in the raw image are semantically far from the pixel marked in green, so is this phenomenon a bug in CCNet?

Hi @lizijue, could you provide some numerical results or a visual example?

lizijue commented 2 years ago

No, I am just referring to the visualization result in Figure 6 of your paper. As you can see, the weights of pixels on the criss-cross path are very high when R = 2, but many of these pixels in the raw image do not lie in the same category, as shown in the ground truth.

I know this is probably due to the mechanism of CCA itself, but I wonder whether it is normal? @speedinghzl

[attached image: Figure 6 from the paper]

ummagumm-a commented 1 year ago

[quotes speedinghzl's visualization code from above]

Hi, thank you for this work. Could you please specify what input this visualization function expects? I can't find anything like 'att_maps' in RCCA.

ltwwwww commented 1 year ago

[quotes speedinghzl's visualization code and ummagumm-a's question from above]

Hello! Did you solve it? I have the same question.