VainF / Torch-Pruning

[CVPR 2023] Towards Any Structural Pruning; LLMs / SAM / Diffusion / Transformers / YOLOv8 / CNNs
https://arxiv.org/abs/2301.12900
MIT License

> I hit a runtime error during execution saying the tensors are not on the same device. @Hyunseok-Kim0 could you explain this to me? Thanks. #344

Open Yuanshan627 opened 6 months ago

Yuanshan627 commented 6 months ago
> I hit a runtime error during execution saying the tensors are not on the same device. @Hyunseok-Kim0 could you explain this to me? Thanks.

That's the problem. How did you solve it?

Originally posted by @aicmaodyu in https://github.com/VainF/Torch-Pruning/issues/147#issuecomment-1727343947 — Did you ever solve this? I'm also getting tensors that are not on the same device.

janthmueller commented 6 months ago

When choosing the device for a tensor, there are two primary approaches:

  1. Explicit definition: You can explicitly define the device for a tensor, specifying whether it should be located on the GPU (cuda) or CPU (cpu).

  2. Default device: Alternatively, tensors can be assigned to the default device, which is determined by the environment and typically defaults to the CPU or GPU.

However, if some tensors are created on the default device (typically the CPU) while others are explicitly moved to a different device (e.g., `cuda`), any operation that combines them will fail with a device-mismatch error, because PyTorch does not implicitly transfer operands between devices.
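To make the failure mode concrete, here is a minimal sketch of the safe pattern: pick the device once and move every tensor to it explicitly before combining them (this runs on CPU-only machines as well, since it falls back to `cpu`):

```python
import torch

# Choose one device up front; mixing a CPU tensor with a CUDA tensor in a
# single op raises "Expected all tensors to be on the same device".
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

a = torch.randn(3, 3).to(device)
b = torch.randn(3, 3).to(device)
c = a @ b  # both operands share a device, so this succeeds
```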

To avoid such errors, one common practice is to set the default tensor type before creating any tensors, ensuring consistency across all tensors within your program. This can be achieved, for example, by checking the desired device (e.g., "cuda") and setting the default tensor type accordingly:

if args.device == "cuda":
    torch.set_default_tensor_type(torch.cuda.FloatTensor)

By setting the default tensor type upfront based on the desired device, you ensure that all subsequently created tensors adhere to the specified device type, minimizing potential conflicts and errors.
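Note that `set_default_tensor_type` is deprecated in recent PyTorch releases; `torch.set_default_device` (available since PyTorch 2.0) is the closer modern equivalent. A hedged sketch, assuming PyTorch >= 2.0:

```python
import torch

# Newer alternative to set_default_tensor_type: make every subsequently
# created tensor land on the chosen device without an explicit .to() call.
if torch.cuda.is_available():
    torch.set_default_device("cuda")

t = torch.zeros(4)  # created on the default device
```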

lwDavid commented 5 months ago

I fixed this bug by changing the loss-computation code.

# ultralytics/ultralytics/utils/loss.py    line 187:
def bbox_decode(self, anchor_points, pred_dist):
    """Decode predicted object bounding box coordinates from anchor points and distribution."""
    if self.use_dfl:
        b, a, c = pred_dist.shape  # batch, anchors, channels

        # Push self.proj to cuda device. 'cuda:0' in this example.
        myDevice = torch.device('cuda:0')  
        self.proj = self.proj.to(myDevice)

        pred_dist = pred_dist.view(b, a, 4, c // 4).softmax(3).matmul(self.proj.type(pred_dist.dtype))
        # pred_dist = pred_dist.view(b, a, c // 4, 4).transpose(2,3).softmax(3).matmul(self.proj.type(pred_dist.dtype))
        # pred_dist = (pred_dist.view(b, a, c // 4, 4).softmax(2) * self.proj.type(pred_dist.dtype).view(1, 1, -1, 1)).sum(2)
    return dist2bbox(pred_dist, anchor_points, xywh=False)

Here I use two lines to move self.proj onto my CUDA device instead of leaving it on the CPU, which fixed the problem. If this doesn't work, try changing the device setting in default.yaml to 'device=0'.
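A more portable variant of the same fix is to move the projection vector to whatever device the predictions already live on, rather than hardcoding 'cuda:0'. A minimal sketch (the standalone `decode_dfl` helper is hypothetical, extracted from the DFL decode step above for illustration):

```python
import torch

def decode_dfl(pred_dist: torch.Tensor, proj: torch.Tensor) -> torch.Tensor:
    # Follow pred_dist's device and dtype instead of a hardcoded 'cuda:0',
    # so the same code runs on CPU-only machines and on any GPU index.
    proj = proj.to(device=pred_dist.device, dtype=pred_dist.dtype)
    b, a, c = pred_dist.shape  # batch, anchors, channels
    return pred_dist.view(b, a, 4, c // 4).softmax(3).matmul(proj)
```

In the original method this would correspond to `self.proj = self.proj.to(pred_dist.device)` in place of the fixed `torch.device('cuda:0')`.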

Yuanshan627 commented 5 months ago

Thank you very much, it works.
