layumi / University1652-triplet-loss

triplet loss with hard negative / soft margin for the University-1652 dataset.

Possible error in loss_triplet.backward() #1

Open · yulinhuyang opened this issue 3 years ago

yulinhuyang commented 3 years ago

[screenshot of the error output, transcribed below]

tensor([0.8285, 0.7775, 0.6481, 0.6993, 0.6650, 0.7633, 0.7279, 0.6936,
        0.6447, 0.7665, 0.6792, 0.7340, 0.7391, 0.8362, 0.6890, 0.6917,
        0.7076, 0.7717, 0.6592, 0.7357, 0.6919, 0.7684, 0.5912, 0.7161,
        0.6979, 0.7280, 0.6658, 0.7369, 0.6964, 0.7534, 0.7353, 0.7866],
       device='cuda:2', grad_fn=)

Traceback (most recent call last):
  File "train-soft-margin-vps.py", line 431, in
    num_epochs=70)
  File "train-soft-margin-vps.py", line 274, in train_model
    loss_triplet.backward()
  File "/home/aita/anaconda3/lib/python3.6/site-packages/torch/tensor.py", line 166, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/aita/anaconda3/lib/python3.6/site-packages/torch/autograd/__init__.py", line 93, in backward
    grad_tensors = _make_grads(tensors, grad_tensors)
  File "/home/aita/anaconda3/lib/python3.6/site-packages/torch/autograd/__init__.py", line 34, in _make_grads
    raise RuntimeError("grad can be implicitly created only for scalar outputs")

layumi commented 3 years ago

Hi @yulinhuyang, did you use fp16 or multiple GPUs?

yulinhuyang commented 3 years ago

No, I did not use fp16 or multiple GPUs. I use one GPU.

The whole log:

[Resize(size=(256, 256), interpolation=PIL.Image.BICUBIC), ToTensor(), Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])]
4.76837158203125e-07

Epoch 0/69

/home/aita/anaconda3/lib/python3.6/site-packages/torch/optim/lr_scheduler.py:100: UserWarning: Detected call of lr_scheduler.step() before optimizer.step(). In PyTorch 1.1.0 and later, you should call them in the opposite order: optimizer.step() before lr_scheduler.step(). Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
  "https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate", UserWarning)

tensor([0.6783, 0.7411, 0.7332, 0.6802, 0.7062, 0.7206, 0.7476, 0.6559,
        0.7280, 0.7294, 0.7672, 0.7622, 0.7835, 0.8170, 0.7035, 0.6644,
        0.6814, 0.7099, 0.7032, 0.6591, 0.6681, 0.7897, 0.7294, 0.7421,
        0.7200, 0.7558, 0.7209, 0.7722, 0.8015, 0.6677, 0.7251, 0.7755],
       device='cuda:5', grad_fn=)

Traceback (most recent call last):
  File "train-soft-margin-vps.py", line 431, in
    num_epochs=70)
  File "train-soft-margin-vps.py", line 274, in train_model
    loss_triplet.backward()
  File "/home/aita/anaconda3/lib/python3.6/site-packages/torch/tensor.py", line 166, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/aita/anaconda3/lib/python3.6/site-packages/torch/autograd/__init__.py", line 93, in backward
    grad_tensors = _make_grads(tensors, grad_tensors)
  File "/home/aita/anaconda3/lib/python3.6/site-packages/torch/autograd/__init__.py", line 34, in _make_grads
    raise RuntimeError("grad can be implicitly created only for scalar outputs")
RuntimeError: grad can be implicitly created only for scalar outputs

[screenshot of the same error, transcribed above]

And I use torch 1.3.1 and torchvision 0.4.2.

layumi commented 3 years ago

The loss_triplet should be one scalar rather than a tensor. Did you change any training settings? You may check whether the sum function works at this line: https://github.com/layumi/University1652-triplet-loss/blob/master/train.py#L250
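For reference, a minimal sketch of the shape issue described here, with illustrative names and sizes that are not taken from the repository: a soft-margin triplet loss computed per triplet comes out as a vector with one entry per anchor, and it must be reduced to a scalar before backward() can be called without arguments.

    import torch
    import torch.nn.functional as F

    # Illustrative embeddings: one anchor/positive/negative row per triplet.
    anchor   = torch.randn(32, 512, requires_grad=True)
    positive = torch.randn(32, 512)
    negative = torch.randn(32, 512)

    d_ap = F.pairwise_distance(anchor, positive)   # shape (32,)
    d_an = F.pairwise_distance(anchor, negative)   # shape (32,)

    # Soft-margin triplet loss log(1 + exp(d_ap - d_an)), still one value per triplet.
    loss_per_triplet = F.softplus(d_ap - d_an)     # shape (32,)

    # backward() is only implicit for scalars, so reduce first.
    loss_triplet = loss_per_triplet.mean()         # or torch.sum(loss_per_triplet)
    loss_triplet.backward()

Calling backward() on the unreduced (32,) vector reproduces exactly the "grad can be implicitly created only for scalar outputs" error shown in the traceback above.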

yulinhuyang commented 3 years ago

Thank you very much for your work! I am sorry I didn't make it clear: I meet the error when I run train-soft-margin.py. train.py and train-contrastive.py run OK.

layumi commented 3 years ago

@yulinhuyang I forgot the details, but you may first try adding one torch.sum.
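A guess at what that could look like around the failing line in train-soft-margin-vps.py (the surrounding code is not shown in this thread, so this is a sketch rather than the repository's actual fix):

    # loss_triplet is currently a per-triplet vector, so this fails:
    #     loss_triplet.backward()
    # Reducing it to a scalar first avoids the RuntimeError:
    torch.sum(loss_triplet).backward()

Using torch.mean instead of torch.sum would also work and keeps the loss scale independent of the batch size.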