out of memory when running

zhenyezi commented 6 years ago

Hello, when I run this program, it runs out of memory partway through. I would like to know the reason. Thank you!

rainofmine commented 6 years ago

If the images in your dataset are not all the same size, the memory needed changes from batch to batch. You can reduce the input size in dataloader.py or use a smaller batch size.
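To make the effect of image size concrete, here is a rough back-of-the-envelope sketch. It only counts the float32 input batch, assuming each batch is padded to its largest image. The function name and the example sizes are hypothetical; activation memory is larger in practice, but it should grow with image area in the same way.

```python
def input_batch_bytes(batch_size, height, width, channels=3, bytes_per_value=4):
    """Bytes taken by one float32 input batch of shape (B, C, H, W)."""
    return batch_size * channels * height * width * bytes_per_value


# Hypothetical sizes, chosen only to show the scaling.
small = input_batch_bytes(batch_size=4, height=480, width=640)
large = input_batch_bytes(batch_size=4, height=960, width=1280)

print(small / 2**20)   # MiB for the smaller batch
print(large / small)   # doubling both sides quadruples the memory
```

So a single unusually large image in a batch can push an otherwise stable run over the limit, which is why capping the input size or lowering the batch size helps.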

nihaoxiaoli commented 5 years ago

I also hit this issue. GPU memory usage increases during training.

ehp commented 5 years ago

The calc_iou method in losses.py sometimes creates HUGE tensors. You have to compute them in chunks:

import torch

def calc_iou(a, b):
    # a: (N, 4) and b: (M, 4) boxes as (x1, y1, x2, y2); returns an (N, M) IoU matrix.
    # b is processed in chunks of `step` boxes so temporaries stay at (N, step).
    step = 20
    IoU = torch.zeros((len(a), len(b)), device=a.device)
    step_count = (len(b) + step - 1) // step  # ceiling division

    area_a = torch.unsqueeze((a[:, 2] - a[:, 0]) * (a[:, 3] - a[:, 1]), dim=1)
    area_b = (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])

    for i in range(step_count):
        chunk = b[i * step:(i + 1) * step]

        # Intersection width and height; clamped to zero for boxes that do not overlap.
        iw = torch.min(torch.unsqueeze(a[:, 2], dim=1), chunk[:, 2])
        iw.sub_(torch.max(torch.unsqueeze(a[:, 0], dim=1), chunk[:, 0]))

        ih = torch.min(torch.unsqueeze(a[:, 3], dim=1), chunk[:, 3])
        ih.sub_(torch.max(torch.unsqueeze(a[:, 1], dim=1), chunk[:, 1]))

        iw.clamp_(min=0)
        ih.clamp_(min=0)

        # Reuse iw in place: it now holds the intersection area.
        iw.mul_(ih)
        del ih

        # Union area, clamped to avoid division by zero.
        ua = area_a + area_b[i * step:(i + 1) * step] - iw
        ua.clamp_(min=1e-8)
        iw.div_(ua)
        del ua

        IoU[:, i * step:(i + 1) * step] = iw

    return IoU
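
As a rough illustration of why chunking helps, the sketch below counts the elements in one intermediate tensor (such as iw or ih) for the unchunked and chunked versions. The anchor and annotation counts are hypothetical, and the helper name is made up for this example; it is not part of losses.py.

```python
def intermediate_elements(num_anchors, num_boxes, step=None):
    """Elements in one (num_anchors, cols) intermediate such as iw or ih.

    step=None models the unchunked computation, where cols == num_boxes.
    """
    cols = num_boxes if step is None else min(step, num_boxes)
    return num_anchors * cols


# Hypothetical workload: 100,000 anchors and 400 annotated boxes.
full = intermediate_elements(100_000, 400)
chunked = intermediate_elements(100_000, 400, step=20)

print(full * 4 / 2**20)   # MiB per float32 intermediate, unchunked
print(full // chunked)    # reduction factor for each intermediate
```

Note that the (len(a), len(b)) result is still allocated in full. Only the temporaries shrink, so this lowers peak memory rather than the size of the output.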

rainofmine / Face_Attention_Network

out of memory when running #3