zzzxxxttt / pytorch_simple_CornerNet

A simple pytorch implementation of CornerNet

error in _ae_loss #2

Open VICKY1991 opened 4 years ago

VICKY1991 commented 4 years ago

    pull_loss, push_loss = _ae_loss(embd_tl, embd_br, batch['ind_masks'])
  File "/home/sambit/PROGRAMMING/pytorch_simple_CornerNet-master/utils/losses.py", line 84, in _ae_loss
    dist = F.relu(1 - (embd_mean[:, None, :] - embd_mean[:, :, None]).abs(), inplace=True)
IndexError: too many indices for tensor of dimension 1

zzzxxxttt commented 4 years ago

What is your batch size? The AE loss function is incompatible with a batch size of 1.
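The thread does not say why a batch size of 1 fails. One plausible cause is an argument-less `squeeze()` on the embeddings before the failing line, which would also drop the batch dimension when it has size 1. The sketch below reproduces the same `IndexError` under that assumption. The `[B, num_obj, 1]` layout and the helpers `pairwise_margin` and `try_batch` are hypothetical and not taken from the repository.

```python
import torch
import torch.nn.functional as F


def pairwise_margin(embd_mean):
    # Same indexing expression as the failing line in the traceback
    # (without inplace=True); it expects a 2-D [B, num_obj] tensor.
    return F.relu(1 - (embd_mean[:, None, :] - embd_mean[:, :, None]).abs())


def try_batch(batch_size, num_obj=4, squeeze_dim=None):
    # ASSUMPTION: embeddings arrive as [B, num_obj, 1]; the real layout may differ.
    embd = torch.zeros(batch_size, num_obj, 1)
    if squeeze_dim is None:
        # Argument-less squeeze removes every size-1 dim, including B when B == 1.
        embd_mean = embd.squeeze()
    else:
        # Squeezing one named dim keeps the batch dim intact.
        embd_mean = embd.squeeze(squeeze_dim)
    try:
        return tuple(pairwise_margin(embd_mean).shape)
    except IndexError as exc:
        return f"IndexError: {exc}"


print(try_batch(2))                  # batch dim survives, result is [B, num_obj, num_obj]
print(try_batch(1))                  # batch dim is squeezed away, indexing fails
print(try_batch(1, squeeze_dim=-1))  # only the trailing dim is removed
```

If the repository's code does match this assumption, squeezing only the trailing dimension would be one way to make the loss work with a single sample, at the cost of diverging from the upstream file.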

PRITISHA2105 commented 4 years ago

Thank you so much for the feedback. My batch size is indeed 1. Unfortunately, I only have an RTX 2080 GPU with 8GB, and I found that any batch size > 1 causes an out-of-memory error. Just a question: the forward + backward pass has a memory footprint of > 7GB. Does that mean only the training phase needs > 8GB of memory, while the final trained weights are about 800MB? Thank you again.

zzzxxxttt commented 4 years ago

@PRITISHA2105 Yes, training requires more memory than evaluation. Try using Hourglass-52 ("small hourglass") as the backbone if you don't have enough memory.
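A rough back-of-the-envelope sketch of why training needs far more memory than the saved weights suggest. It assumes the 800MB checkpoint holds only float32 weights and that training uses an Adam-style optimizer with two extra buffers per parameter. Neither assumption is confirmed in this thread, and the function names are made up for illustration.

```python
BYTES_PER_FLOAT32 = 4
BYTES_PER_MB = 1_000_000


def approx_param_count(checkpoint_mb):
    # ASSUMPTION: the checkpoint stores float32 weights and nothing else.
    return checkpoint_mb * BYTES_PER_MB // BYTES_PER_FLOAT32


def estimate_training_state_mb(checkpoint_mb, optimizer_buffers=2):
    # Weights + gradients + optimizer buffers; activations are NOT included.
    weights = checkpoint_mb
    grads = checkpoint_mb                      # one gradient per weight
    optim = optimizer_buffers * checkpoint_mb  # e.g. two moment buffers for Adam-style optimizers
    return weights + grads + optim


print(approx_param_count(800))          # parameters implied by an 800MB float32 checkpoint
print(estimate_training_state_mb(800))  # MB needed before any activation is stored
```

Whatever remains of the footprint is mostly activations kept for the backward pass, which grow with batch size and input resolution. Evaluation does not need gradients or optimizer buffers, and under `torch.no_grad()` it does not keep activations for a backward pass either.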