NVIDIA / apex

A PyTorch Extension: Tools for easy mixed precision and distributed training in Pytorch
BSD 3-Clause "New" or "Revised" License

If you _really_ know what you are doing, you can disable this warning by passing allow_banned=True to `amp.init()`. #504

Open molyswu opened 5 years ago

molyswu commented 5 years ago

```
[1,1]:enabled                : True
[1,1]:loss_scale             : dynamic
[1,1]:master_weights         : None
[1,1]:patch_torch_functions  : True
[1,1]:opt_level              : O1
[1,1]:keep_batchnorm_fp32    : None
[1,1]:cast_model_type        : None
[1,1]:Processing user overrides (additional kwargs that are not None)...
[1,1]:After processing overrides, optimization options are:
[1,1]:enabled                : True
[1,1]:loss_scale             : dynamic
[1,1]:master_weights         : None
[1,1]:patch_torch_functions  : True
[1,1]:opt_level              : O1
[1,1]:keep_batchnorm_fp32    : None
[1,1]:cast_model_type        : None
[1,0]:epoch: 0, batch: 0
[1,0]:Traceback (most recent call last):
[1,0]:  File "train_1.py", line 187, in <module>
[1,0]:    loss, outputs = model(imgs, targets)
[1,0]:  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 547, in __call__
[1,0]:    result = self.forward(*input, **kwargs)
[1,0]:  File "/media/ai/sdc1/tools/horovod_study/yolov3-pytoch/PyTorch-YOLOv3/models.py", line 260, in forward
[1,0]:    x, layer_loss = module[0](x, targets, img_dim)
[1,0]:  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 547, in __call__
[1,0]:    result = self.forward(*input, **kwargs)
[1,0]:  File "/media/ai/sdc1/tools/horovod_study/yolov3-pytoch/PyTorch-YOLOv3/models.py", line 197, in forward
[1,0]:    loss_conf_obj = self.bce_loss(pred_conf[obj_mask], tconf[obj_mask])
[1,0]:  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 547, in __call__
[1,0]:    result = self.forward(*input, **kwargs)
[1,0]:  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/loss.py", line 498, in forward
[1,0]:    return F.binary_cross_entropy(input, target, weight=self.weight, reduction=self.reduction)
[1,0]:  File "/usr/local/lib/python3.5/dist-packages/apex/amp/wrap.py", line 124, in wrapper
[1,0]:    raise NotImplementedError(custom_err_msg)
[1,0]:NotImplementedError:
[1,0]:amp does not work out-of-the-box with F.binary_cross_entropy or torch.nn.BCELoss. It requires that the output of the previous function be already a FloatTensor.
[1,0]:
[1,0]:Most models have a Sigmoid right before BCELoss. In that case, you can use
[1,0]:    torch.nn.BCEWithLogitsLoss
[1,0]:to combine Sigmoid+BCELoss into a single layer that is compatible with amp.
[1,0]:Another option is to add
[1,0]:    amp.register_float_function(torch, 'sigmoid')
[1,0]:before calling amp.init().
[1,0]:If you _really_ know what you are doing, you can disable this warning by passing allow_banned=True to amp.init().
```

(Rank [1,1] raises the identical traceback.)
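For reference, here is a minimal sketch of the pattern amp rejects next to the fused alternative the error message recommends. The tensor names and shapes are illustrative, not taken from the YOLOv3 code above:

```python
import torch
import torch.nn as nn

logits = torch.randn(8, 1)  # raw, pre-activation outputs (illustrative)
targets = torch.rand(8, 1)  # target probabilities in [0, 1]

# The pattern amp rejects at O1: under amp, torch.sigmoid may return a
# HalfTensor, and F.binary_cross_entropy / nn.BCELoss refuse
# half-precision inputs.
# probs = torch.sigmoid(logits)
# loss = nn.BCELoss()(probs, targets)

# amp-compatible: fuse Sigmoid + BCELoss into a single, numerically
# stable op that takes raw logits directly.
loss = nn.BCEWithLogitsLoss()(logits, targets)
```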

ptrblck commented 5 years ago

Double post from here. As the error message suggests, you could e.g. switch the criterion to nn.BCEWithLogitsLoss and remove the sigmoid, register the sigmoid as a float function, or disable the warning; the second option is sketched below.
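A minimal sketch of the register-as-float-function route, assuming apex is installed and a CUDA device is available; the model and optimizer here are placeholders. The error message refers to the older amp.init() entry point, but with the amp.initialize() API the registration likewise has to happen before amp patches the torch functions:

```python
import torch
import torch.nn as nn
from apex import amp

model = nn.Linear(10, 1).cuda()  # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

# Force torch.sigmoid to run in (and return) FP32 so the downstream
# BCELoss always receives a FloatTensor; register before initializing amp.
amp.register_float_function(torch, 'sigmoid')

model, optimizer = amp.initialize(model, optimizer, opt_level="O1")
```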