yangsenius / TransPose

PyTorch Implementation for "TransPose: Keypoint localization via Transformer", ICCV 2021.
https://github.com/yangsenius/TransPose/releases/download/paper/transpose.pdf
MIT License

Target heatmaps #31

Closed. mukeshnarendran7 closed this issue 2 years ago.

mukeshnarendran7 commented 2 years ago

When I prepare target heatmaps for keypoint coordinates, mapping from a 256x256 image down to a 64x48 heatmap, the minimum heatmap values are extremely small, e.g. around 1.3e-35 (problem), while the maximum is about 0.99 (fine). Did you encounter such a phenomenon while preparing heatmaps?

Is there a way to avoid this? Thanks

yangsenius commented 2 years ago

You may be applying the Gaussian distribution over the whole heatmap, so pixels far from the keypoint get values like exp(-d^2 / (2*sigma^2)) for a large distance d, which are tiny but nonzero (on the order of 1e-35). There is a solution that generates the Gaussian peak only in a local window, so that the values in most areas are exactly zero.

https://github.com/microsoft/human-pose-estimation.pytorch/blob/18f1d0fa5b5db7fe08de640610f3fdbdbed8fb2f/lib/dataset/JointsDataset.py#L210
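
For reference, a minimal sketch of that local-window approach, condensed from the linked JointsDataset.generate_target (the function name generate_local_gaussian is just for illustration):

import numpy as np

def generate_local_gaussian(heatmap, mu_x, mu_y, sigma=2):
    """Write a gaussian peak only inside a (6*sigma+1)-sized window around (mu_x, mu_y).
    heatmap: (H, W) array pre-filled with zeros; modified in place."""
    h, w = heatmap.shape
    tmp_size = sigma * 3
    # Window corners in heatmap coordinates
    ul = [int(mu_x - tmp_size), int(mu_y - tmp_size)]
    br = [int(mu_x + tmp_size + 1), int(mu_y + tmp_size + 1)]
    if ul[0] >= w or ul[1] >= h or br[0] < 0 or br[1] < 0:
        return heatmap  # window falls entirely outside the heatmap
    # Gaussian defined only on the small window
    size = 2 * tmp_size + 1
    x = np.arange(0, size, 1, np.float32)
    y = x[:, np.newaxis]
    x0 = y0 = size // 2
    g = np.exp(-((x - x0) ** 2 + (y - y0) ** 2) / (2 * sigma ** 2))
    # Usable ranges of the window and of the heatmap (handles border clipping)
    g_x = max(0, -ul[0]), min(br[0], w) - ul[0]
    g_y = max(0, -ul[1]), min(br[1], h) - ul[1]
    img_x = max(0, ul[0]), min(br[0], w)
    img_y = max(0, ul[1]), min(br[1], h)
    heatmap[img_y[0]:img_y[1], img_x[0]:img_x[1]] = g[g_y[0]:g_y[1], g_x[0]:g_x[1]]
    return heatmap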

mukeshnarendran7 commented 2 years ago

Hi, thanks for getting back; that fixed the problem with the values. I have tried to train the model on my own dataset by changing only the final conv layer, but it doesn't seem to be learning at all. The loss decreases from 0.1 to 0.09 and then stagnates around there, yet the final predictions are completely off and it hasn't learnt on the training data after around 60 epochs. Is there something else I need to consider when trying to fine-tune the model for another application?

I am using:

criterion = torch.nn.MSELoss(reduction="mean")  # or JointsMSELoss
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4, weight_decay=1e-3)
targets: shape (8, 18, 64, 48)
sigma = 2

I also get the following warning when loading the pre-trained model:

Using cache found in /root/.cache/torch/hub/yangsenius_TransPose_main
/root/.cache/torch/hub/yangsenius_TransPose_main/lib/models/transpose_r.py:333: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  dim_t = temperature ** (2 * (dim_t // 2) / one_direction_feats)

yangsenius commented 2 years ago

Hi @mukeshnarendran7. When you define the optimizer, you should set it up like below:

pretrain_part = [param for name, param in model.named_parameters() if 'final_layer' not in name]

optimizer = torch.optim.Adam([
    {'params': pretrain_part, 'lr': 1e-5},
    {'params': model.final_layer.parameters(), 'lr': 1e-4},
])

And in our MPII practice, the learning rate for pretrain_part is kept at 1e-5 without changing, while the learning rate for the final_layer (a 1x1 conv) decays from 1e-4 to 1e-5.
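
If it helps, one way to implement that schedule is a per-group LambdaLR on the optimizer defined above; the milestone epoch below is only an illustrative assumption, not the exact setting we used:

# LambdaLR takes one lambda per param group, in the order the groups were added.
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer,
    lr_lambda=[
        lambda epoch: 1.0,                           # pretrain_part: constant 1e-5
        lambda epoch: 0.1 if epoch >= 100 else 1.0,  # final_layer: 1e-4 -> 1e-5 after an assumed milestone
    ],
)
# call scheduler.step() once per epoch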

As for the other issue, I haven't encountered this. What's your PyTorch version? I successfully loaded the pretrained models with PyTorch 1.6 or 1.7, e.g. in Colab.
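
For reference, loading a released model through torch.hub looks roughly like this (the entrypoint name below is just an example; check hubconf.py in the repo for the exact names):

import torch

model = torch.hub.load('yangsenius/TransPose:main', 'tpr_a4_256x192', pretrained=True)
model.eval()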

yangsenius commented 2 years ago

It seems that this UserWarning: __floordiv__ is deprecated is a PyTorch version issue (a deprecation warning rather than an error).
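
If you want to silence it, the rewrite suggested by the warning itself would be, in lib/models/transpose_r.py (a sketch, not an official patch):

# original: dim_t = temperature ** (2 * (dim_t // 2) / one_direction_feats)
dim_t = temperature ** (2 * torch.div(dim_t, 2, rounding_mode='floor') / one_direction_feats)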

mukeshnarendran7 commented 2 years ago

Hey, I tried what you suggested above and also tried the model without pre-training; it doesn't learn much on the images, although the loss curves go down smoothly. The only part I changed is the generation of the target heatmaps, and they look fine when plotted. I tried to adapt part of the code you suggested earlier for heatmap generation:

import numpy as np

def adjust_targets(x, y, tmp_size, to_size):
    """Compute the usable gaussian window for a keypoint, clipped to the heatmap bounds"""
    # feat_stride = self.image_size / self.heatmap_size
    mu_x = x
    mu_y = y
    # Check that any part of the gaussian is in-bounds
    ul = [int(mu_x - tmp_size), int(mu_y - tmp_size)]
    br = [int(mu_x + tmp_size + 1), int(mu_y + tmp_size + 1)]
    # Usable gaussian range
    g_x = max(0, -ul[0]), min(br[0], to_size[0]) - ul[0]
    g_y = max(0, -ul[1]), min(br[1], to_size[1]) - ul[1]
    # Image range
    img_x = max(0, ul[0]), min(br[0], to_size[0])
    img_y = max(0, ul[1]), min(br[1], to_size[1])
    if ul[0] >= to_size[0] or ul[1] >= to_size[1] or br[0] < 0 or br[1] < 0:
        # If not, just return the image as is
        x = 0
        y = 0
    else:
        x = mu_x
        y = mu_y

    return x, y, g_x, g_y, img_x, img_y


def get_heatmaps_likelihood(sample,
                            sigma=2,
                            to_size=(48, 64),
                            normalize=True):
    """
    Generate heatmaps from the keypoints of a sample.
    :param sample: The sample (with 'image' and 'keypoints') to generate heatmaps from.
    :param sigma: The standard deviation of the gaussian peak.
    :param normalize: Whether to normalize the heatmaps.
    :return: The heatmaps of the keypoints.
    """
    image, keypoints = sample['image'], sample['keypoints']
    h, w = to_size  # target heatmap shape
    x = np.arange(w)
    y = np.arange(h)
    xx, yy = np.meshgrid(x, y)
    tmp_size = sigma*3
    heatmaps = np.zeros([len(keypoints), h, w]) 

    for i, (x, y) in enumerate(keypoints):
        x, y, g_x, g_y, img_x, img_y = adjust_targets((x/256)*64, (y/256)*48, tmp_size, to_size)

        # Gaussian distribution with peak at the keypoint annotation
        heatmaps[i] = np.exp(-((yy - y) ** 2 + (xx - x) ** 2) / (2 * sigma ** 2)).astype(np.float16)

        if not normalize:
            heatmaps[i] /= sigma * np.sqrt(2 * np.pi)

    return heatmaps

yangsenius commented 2 years ago

Why not use it like this, according to your initial purpose?

# Compute the gaussian on the full grid, but only write the local window
# around the keypoint into the heatmap; everywhere else stays exactly zero.
g = np.exp(-((yy - y) ** 2 + (xx - x) ** 2) / (2 * sigma ** 2)).astype(np.float16)
heatmaps[i][img_y[0]:img_y[1], img_x[0]:img_x[1]] = g[img_y[0]:img_y[1], img_x[0]:img_x[1]]
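
Assuming you substitute that windowed assignment for the full-grid one inside get_heatmaps_likelihood, a quick sanity check on one of your samples (a dict with 'image' and 'keypoints', as in your function) would be:

heatmaps = get_heatmaps_likelihood(sample, sigma=2, to_size=(48, 64))
print(heatmaps.min())  # should now be exactly 0.0 instead of ~1.3e-35
print(heatmaps.max())  # still ~1.0 at the annotated keypoints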