HRNet / HRNet-Facial-Landmark-Detection

This is an official implementation of facial landmark detection for our TPAMI paper "Deep High-Resolution Representation Learning for Visual Recognition". https://arxiv.org/abs/1908.07919
MIT License

Suggestions for adjusting scoremap #60

Open mucunwuxian opened 4 years ago

mucunwuxian commented 4 years ago

Hi. First, thanks a lot for your work. I could achieve incredible results with it! I love HRNet.

Sorry if this is a bit rough, but I would like to open a discussion. How about adjusting the score map by template matching, as in the following code? Or have you already tried it?

output = model(inp)
score_map = output.data.cpu()
# trial: refine the score map by template matching before decoding
score_map = adjust_scoremap(score_map, config.MODEL.SIGMA, config.MODEL.TARGET_TYPE)
preds = decode_preds(score_map, meta['center'], meta['scale'], [64, 64])

import numpy as np
import torch


def adjust_scoremap(score_map, sigma, label_type='Gaussian'):
    # Kernel half-width; rounded up to an integer so the kernel size is odd
    # and padding=size // 2 keeps the heatmap resolution (also for sigma=1.5)
    tmp_size = int(np.ceil(sigma * 3))

    # Generate gaussian
    size = 2 * tmp_size + 1
    x = np.arange(0, size, 1, np.float32)
    y = x[:, np.newaxis]
    x0 = y0 = size // 2
    # The gaussian is not normalized, we want the center value to equal 1
    if label_type == 'Gaussian':
        g = np.exp(- ((x - x0) ** 2 + (y - y0) ** 2) / (2 * sigma ** 2))
    else:
        g = sigma / (((x - x0) ** 2 + (y - y0) ** 2 + sigma ** 2) ** 1.5)

    # adjust for template matching (zero mean, unit variance)
    g = (g - np.mean(g)) / np.std(g)

    # for the method of conv, refer to [https://discuss.pytorch.org/t/torch-nn-conv2d-custom-filters/17694]
    g = g[np.newaxis, np.newaxis, :, :]
    g = torch.tensor(g.astype(np.float32))
    # note: score_map is modified in place, one landmark channel at a time
    for i in range(score_map.size(1)):
        score_map_tmp = score_map[:, i, :, :].unsqueeze(1)
        # standardize the channel (statistics are taken over the whole batch)
        score_map_tmp = (score_map_tmp - torch.mean(score_map_tmp)) / torch.std(score_map_tmp)
        score_map_tmp = torch.nn.functional.conv2d(score_map_tmp, g, padding=size // 2)
        score_map[:, i, :, :] = score_map_tmp[:, 0, :, :]

    return score_map

Best regards!

Sierkinhane commented 4 years ago

Is there an improvement?

mucunwuxian commented 3 years ago

@Sierkinhane Thank you for your comment!

My proposal is to calculate the NCC (Normalized Cross-Correlation) between each score map and a 2D Gaussian template. I think this smooths the heatmap and makes the output coordinates more stable. What do you think?
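
To make the NCC idea concrete, here is a minimal NumPy sketch on a synthetic heatmap. It is only an illustration and not part of the repository: `gaussian_template` and `ncc_map` are hypothetical helper names, and it assumes NCC is computed per window (each window is normalized by its own mean and norm), whereas the snippet posted earlier in this thread standardizes each channel globally before the convolution.

```python
import numpy as np


def gaussian_template(sigma):
    """Unnormalized 2D Gaussian with an odd side length (hypothetical helper)."""
    half = int(np.ceil(3 * sigma))
    ax = np.arange(-half, half + 1, dtype=np.float32)
    return np.exp(-(ax[None, :] ** 2 + ax[:, None] ** 2) / (2 * sigma ** 2))


def ncc_map(heatmap, template, eps=1e-8):
    """Normalized cross-correlation of one 2D heatmap with a template.

    Every window is normalized by its own mean and norm, so values lie in
    [-1, 1]. Borders are zero-padded so the output keeps the input shape.
    """
    th, tw = template.shape
    t = template - template.mean()
    t_norm = np.sqrt((t ** 2).sum())
    padded = np.pad(heatmap, ((th // 2, th // 2), (tw // 2, tw // 2)))
    out = np.zeros(heatmap.shape, dtype=np.float32)
    for r in range(heatmap.shape[0]):
        for c in range(heatmap.shape[1]):
            w = padded[r:r + th, c:c + tw]
            w = w - w.mean()
            out[r, c] = (w * t).sum() / (np.sqrt((w ** 2).sum()) * t_norm + eps)
    return out


# Synthetic heatmap: a Gaussian blob centered at (row=20, col=30)
# plus an isolated one-pixel spike that is higher than the blob.
g = gaussian_template(1.5)
half = g.shape[0] // 2
heatmap = np.zeros((64, 64), dtype=np.float32)
heatmap[20 - half:20 + half + 1, 30 - half:30 + half + 1] = g
heatmap[50, 10] = 1.2

raw_peak = tuple(int(v) for v in np.unravel_index(np.argmax(heatmap), heatmap.shape))
ncc = ncc_map(heatmap, g)
ncc_peak = tuple(int(v) for v in np.unravel_index(np.argmax(ncc), ncc.shape))
print(raw_peak, ncc_peak)
```

In this toy case a plain argmax picks the spike, while the NCC map peaks at the blob center, because the spike does not look like the Gaussian template. Whether real HRNet score maps behave the same way is exactly what would need to be measured.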

In addition, when I actually tested it on video, I confirmed that the coordinates jittered from frame to frame.
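
One way to put a number on this jitter, and so to compare predictions with and without the adjustment, is the mean frame-to-frame displacement of each landmark on a clip where the face barely moves. The sketch below is an assumption about how one could measure it, not something from the repository; `mean_jitter` is a hypothetical helper and the data is a toy example.

```python
import numpy as np


def mean_jitter(preds):
    """Mean Euclidean displacement of landmarks between consecutive frames.

    preds: array of shape (num_frames, num_landmarks, 2), pixel coordinates.
    """
    preds = np.asarray(preds, dtype=np.float64)
    step = np.diff(preds, axis=0)  # shape (num_frames - 1, num_landmarks, 2)
    return float(np.linalg.norm(step, axis=-1).mean())


# Toy data: two landmarks that should stay still over five frames
stable = np.tile(np.array([[10.0, 20.0], [30.0, 40.0]]), (5, 1, 1))
noisy = stable.copy()
noisy[1::2, :, 0] += 1.0  # x jumps by one pixel on every other frame

print(mean_jitter(stable), mean_jitter(noisy))
```

A lower value on the same clip would suggest steadier coordinates, although on real video it also includes genuine face motion.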

Best regards!