WongKinYiu / ScaledYOLOv4

Scaled-YOLOv4: Scaling Cross Stage Partial Network
GNU General Public License v3.0

TypeError: can't convert cuda:0 device type tensor to numpy #382

Open CdAB63 opened 2 years ago

CdAB63 commented 2 years ago

Systematic error. Platform: Ubuntu. CUDA downgraded to 10.2, torch downgraded to 10.0, and torchview downgraded to match.

└─$ python test.py --img 896 --conf 0.001 --batch 8 --device 0 --data coco.yaml --weights weights/yolov4-p5.pt

Namespace(weights=['weights/yolov4-p5.pt'], data='./data/coco.yaml', batch_size=8, img_size=896, conf_thres=0.001, iou_thres=0.65, save_json=True, task='val', device='0', single_cls=False, augment=False, merge=False, verbose=False, save_txt=False)
Using CUDA device0 _CudaDeviceProperties(name='NVIDIA GeForce GTX 1060', total_memory=6078MB)

Fusing layers... Model Summary: 331 layers, 7.07943e+07 parameters, 6.81919e+07 gradients
/home/cdab63/.local/lib/python3.9/site-packages/torch/functional.py:445: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:2157.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]

Scanning labels ../coco/val2017.cache (0 found, 0 missing, 5000 empty, 0 duplicate, for 5000 images): 100%|████████████| 5000/5000 [00:00<00:00, 777904.22it/s]

WARNING: No labels found in ../coco/val2017/. See

           Class      Images     Targets           P           R      mAP@.5  mAP@.5:.95:   0%|                                    | 0/625 [00:00<?, ?it/s]  

Traceback (most recent call last):
File "/home/cdab63/Desenvolvimento/Deep-Learning/ScaledYOLOv4/test.py", line 269, in <module>
test(opt.data,
File "/home/cdab63/Desenvolvimento/Deep-Learning/ScaledYOLOv4/test.py", line 189, in test
plot_images(img, output_to_target(output, width, height), paths, str(f), names) # predictions
File "/home/cdab63/Desenvolvimento/Deep-Learning/ScaledYOLOv4/utils/general.py", line 1103, in output_to_target
return np.array(targets)
File "/home/cdab63/.local/lib/python3.9/site-packages/torch/_tensor.py", line 678, in __array__
return self.numpy()
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

CdAB63 commented 2 years ago

It can be fixed as follows:

def output_to_target(output, width, height):
    # Convert model output to target format [batch_id, class_id, x, y, w, h, conf]
    if isinstance(output, torch.Tensor):
        output = output.cpu().numpy()
    targets = []
    for i, o in enumerate(output):
        if o is not None:
            # Each per-image prediction may still live on the GPU; copy it to host memory once
            if isinstance(o, torch.Tensor):
                o = o.cpu().numpy()
            for pred in o:
                box = pred[:4]  # x1, y1, x2, y2 in pixels
                w = (box[2] - box[0]) / width
                h = (box[3] - box[1]) / height
                x = box[0] / width + w / 2
                y = box[1] / height + h / 2
                conf = pred[4]
                cls = int(pred[5])
                targets.append([i, cls, float(x), float(y), float(w), float(h), float(conf)])
    return np.array(targets)
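
The box arithmetic inside the loop can be checked on its own, without a GPU. This is only a minimal sketch in plain Python: the helper name `xyxy_to_normalized_xywh` is hypothetical and not part of the repository, and it assumes the box is given as pixel corners `(x1, y1, x2, y2)`, as in `pred[:4]` above.

```python
def xyxy_to_normalized_xywh(box, width, height):
    """Convert pixel corners (x1, y1, x2, y2) to a normalized (x_center, y_center, w, h)."""
    x1, y1, x2, y2 = (float(v) for v in box)
    w = (x2 - x1) / width       # box width as a fraction of the image width
    h = (y2 - y1) / height      # box height as a fraction of the image height
    x = x1 / width + w / 2      # normalized center x
    y = y1 / height + h / 2     # normalized center y
    return x, y, w, h


if __name__ == "__main__":
    # Hypothetical box on a 1000x800 image, for illustration only
    print(xyxy_to_normalized_xywh((100, 200, 300, 400), 1000, 800))
```

Because the helper converts each value with `float()`, it should also accept single-element tensors or NumPy scalars, provided they have already been moved to the CPU.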

Pabligme commented 2 years ago

I have had the same problem for the last few days. I applied the same fix as you, but then the disk crashed (on Google Colab, when trying to train Scaled YOLOv4). Do you know how I can fix it?

Pabligme commented 2 years ago

I fixed it by using a smaller batch size.

kXborg commented 2 years ago

I have faced the same error and fixed it using .cpu().numpy(). However, there are some issues with the detections: the first detections show up as NaN, and no bounding boxes are detected during inference.

The same issue is not present with CPU inference. If anyone has found a solution, please advise.
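
One way to narrow this down might be to count how many prediction rows contain NaN before they reach plotting, and to compare the GPU and CPU runs. The following is only a sketch under assumptions: `count_nan_rows` is a hypothetical helper, not part of the repository, and it assumes each per-image prediction is either `None` or a sequence of rows `[x1, y1, x2, y2, conf, cls]`, as consumed by `output_to_target`. Objects with a `tolist()` method (such as tensors or NumPy arrays) are converted to plain lists first.

```python
import math


def count_nan_rows(predictions):
    """Return, for each image, the number of prediction rows that contain a NaN."""
    counts = []
    for pred in predictions:
        if pred is None:
            counts.append(0)
            continue
        # Tensors and arrays expose tolist(); plain lists are used as they are
        rows = pred.tolist() if hasattr(pred, "tolist") else pred
        counts.append(sum(any(math.isnan(v) for v in row) for row in rows))
    return counts


if __name__ == "__main__":
    # Made-up rows for illustration: the second row of the first image has a NaN
    preds = [
        [[10.0, 20.0, 30.0, 40.0, 0.9, 0.0],
         [float("nan"), 20.0, 30.0, 40.0, 0.5, 1.0]],
        None,
        [],
    ]
    print(count_nan_rows(preds))
```

If the counts are non-zero only on the GPU run, the NaN values are already present in the model output rather than introduced by the conversion to NumPy.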