Tianxiaomo / pytorch-YOLOv4

PyTorch, ONNX and TensorRT implementation of YOLOv4
Apache License 2.0

Torch2ONNX _pickle.UnpicklingError: invalid load key, '<' #495

Closed mfoglio closed 2 years ago

mfoglio commented 2 years ago

Within the docker container nvcr.io/nvidia/deepstream:6.0-triton:

mkdir /src
cd /src
git clone https://github.com/Tianxiaomo/pytorch-YOLOv4.git
cd pytorch-YOLOv4
apt -y install python3-venv
python3 -m venv venv
source venv/bin/activate
pip install --upgrade pip
pip3 install \
numpy==1.18.2 \
torch==1.4.0 \
tensorboardX==2.0 \
scikit_image==0.16.2 \
matplotlib==2.2.3 \
tqdm==4.43.0 \
easydict==1.9 \
Pillow==7.1.2 \
opencv_python \
onnx \
onnxruntime

wget --no-check-certificate "https://docs.google.com/uc?export=download&id=1wv_LiFeCRYwtpkqREPeI13-gPELBDwuJ" -r -A 'uc*' -e robots=off -nd -O yolov4.pth
python3 demo_pytorch2onnx.py yolov4.pth data/dog.jpg 8 80 416 416

Error:

(venv) root@ip-172-31-9-127:/src/pytorch-YOLOv4# python3 demo_pytorch2onnx.py ./yolov4.pth data/dog.jpg 8 80 416 416
Converting to onnx and running demo ...
Traceback (most recent call last):
  File "demo_pytorch2onnx.py", line 96, in <module>
    main(weight_file, image_path, batch_size, n_classes, IN_IMAGE_H, IN_IMAGE_W)
  File "demo_pytorch2onnx.py", line 72, in main
    transform_to_onnx(weight_file, batch_size, n_classes, IN_IMAGE_H, IN_IMAGE_W)
  File "demo_pytorch2onnx.py", line 19, in transform_to_onnx
    pretrained_dict = torch.load(weight_file, map_location=torch.device('cuda'))
  File "/src/pytorch-YOLOv4/venv/lib/python3.8/site-packages/torch/serialization.py", line 529, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/src/pytorch-YOLOv4/venv/lib/python3.8/site-packages/torch/serialization.py", line 692, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, '<'.
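
The '<' reported as the invalid load key is the first byte of the file that pickle could not interpret. One plausible explanation is that wget saved an HTML page (for example a download confirmation page) under the name yolov4.pth instead of the weights. A minimal sketch of a check to run before torch.load follows; the helper name looks_like_html and the 64-byte probe size are illustrative assumptions, not part of the repository.

```python
def looks_like_html(path, probe_bytes=64):
    """Return True if the file appears to start with markup rather than binary data.

    Hypothetical helper: it only reads the first few bytes and checks whether
    the first non-whitespace byte is '<', as an HTML or XML page would have.
    """
    with open(path, "rb") as f:
        head = f.read(probe_bytes)
    return head.lstrip().startswith(b"<")


# Example use before converting (path taken from the commands above):
#   if looks_like_html("yolov4.pth"):
#       raise SystemExit("yolov4.pth looks like an HTML page; download the weights again")
```

If the check returns True, opening the file in a text editor should show what the server actually returned.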
mfoglio commented 2 years ago

Update: I think the wget command was not downloading the YOLO weights correctly. Downloading them manually seems to fix the error above, but now I get the following error:

Converting to onnx and running demo ...
Traceback (most recent call last):
  File "demo_pytorch2onnx.py", line 96, in <module>
    main(weight_file, image_path, batch_size, n_classes, IN_IMAGE_H, IN_IMAGE_W)
  File "demo_pytorch2onnx.py", line 72, in main
    transform_to_onnx(weight_file, batch_size, n_classes, IN_IMAGE_H, IN_IMAGE_W)
  File "demo_pytorch2onnx.py", line 20, in transform_to_onnx
    model.load_state_dict(pretrained_dict)
  File "/src/pytorch-YOLOv4/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 829, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Yolov4:
        Missing key(s) in state_dict: "neck.conv1.conv.0.weight", "neck.conv1.conv.1.weight", "neck.conv1.conv.1.bias", "neck.conv1.conv.1.running_mean", "neck.conv1.conv.1.running_var", "neck.conv2.conv.0.weight", "neck.conv2.conv.1.weight", "neck.conv2.conv.1.bias", "neck.conv2.conv.1.running_mean", "neck.conv2.conv.1.running_var", "neck.conv3.conv.0.weight", "neck.conv3.conv.1.weight", "neck.conv3.conv.1.bias", "neck.conv3.conv.1.running_mean", "neck.conv3.conv.1.running_var", "neck.conv4.conv.0.weight", "neck.conv4.conv.1.weight", "neck.conv4.conv.1.bias", "neck.conv4.conv.1.running_mean", "neck.conv4.conv.1.running_var", "neck.conv5.conv.0.weight", "neck.conv5.conv.1.weight", "neck.conv5.conv.1.bias", "neck.conv5.conv.1.running_mean", "neck.conv5.conv.1.running_var", "neck.conv6.conv.0.weight", "neck.conv6.conv.1.weight", "neck.conv6.conv.1.bias", "neck.conv6.conv.1.running_mean", "neck.conv6.conv.1.running_var", "neck.conv7.conv.0.weight", "neck.conv7.conv.1.weight", "neck.conv7.conv.1.bias", "neck.conv7.conv.1.running_mean", "neck.conv7.conv.1.running_var", "neck.conv8.conv.0.weight", "neck.conv8.conv.1.weight", "neck.conv8.conv.1.bias", "neck.conv8.conv.1.running_mean", "neck.conv8.conv.1.running_var", "neck.conv9.conv.0.weight", "neck.conv9.conv.1.weight", "neck.conv9.conv.1.bias", "neck.conv9.conv.1.running_mean", "neck.conv9.conv.1.running_var", "neck.conv10.conv.0.weight", "neck.conv10.conv.1.weight", "neck.conv10.conv.1.bias", "neck.conv10.conv.1.running_mean", "neck.conv10.conv.1.running_var", "neck.conv11.conv.0.weight", "neck.conv11.conv.1.weight", "neck.conv11.conv.1.bias", "neck.conv11.conv.1.running_mean", "neck.conv11.conv.1.running_var", "neck.conv12.conv.0.weight", "neck.conv12.conv.1.weight", "neck.conv12.conv.1.bias", "neck.conv12.conv.1.running_mean", "neck.conv12.conv.1.running_var", "neck.conv13.conv.0.weight", "neck.conv13.conv.1.weight", "neck.conv13.conv.1.bias", "neck.conv13.conv.1.running_mean", "neck.conv13.conv.1.running_var", 
"neck.conv14.conv.0.weight", "neck.conv14.conv.1.weight", "neck.conv14.conv.1.bias", "neck.conv14.conv.1.running_mean", "neck.conv14.conv.1.running_var", "neck.conv15.conv.0.weight", "neck.conv15.conv.1.weight", "neck.conv15.conv.1.bias", "neck.conv15.conv.1.running_mean", "neck.conv15.conv.1.running_var", "neck.conv16.conv.0.weight", "neck.conv16.conv.1.weight", "neck.conv16.conv.1.bias", "neck.conv16.conv.1.running_mean", "neck.conv16.conv.1.running_var", "neck.conv17.conv.0.weight", "neck.conv17.conv.1.weight", "neck.conv17.conv.1.bias", "neck.conv17.conv.1.running_mean", "neck.conv17.conv.1.running_var", "neck.conv18.conv.0.weight", "neck.conv18.conv.1.weight", "neck.conv18.conv.1.bias", "neck.conv18.conv.1.running_mean", "neck.conv18.conv.1.running_var", "neck.conv19.conv.0.weight", "neck.conv19.conv.1.weight", "neck.conv19.conv.1.bias", "neck.conv19.conv.1.running_mean", "neck.conv19.conv.1.running_var", "neck.conv20.conv.0.weight", "neck.conv20.conv.1.weight", "neck.conv20.conv.1.bias", "neck.conv20.conv.1.running_mean", "neck.conv20.conv.1.running_var". 
        Unexpected key(s) in state_dict: "neek.conv1.conv.0.weight", "neek.conv1.conv.1.weight", "neek.conv1.conv.1.bias", "neek.conv1.conv.1.running_mean", "neek.conv1.conv.1.running_var", "neek.conv1.conv.1.num_batches_tracked", "neek.conv2.conv.0.weight", "neek.conv2.conv.1.weight", "neek.conv2.conv.1.bias", "neek.conv2.conv.1.running_mean", "neek.conv2.conv.1.running_var", "neek.conv2.conv.1.num_batches_tracked", "neek.conv3.conv.0.weight", "neek.conv3.conv.1.weight", "neek.conv3.conv.1.bias", "neek.conv3.conv.1.running_mean", "neek.conv3.conv.1.running_var", "neek.conv3.conv.1.num_batches_tracked", "neek.conv4.conv.0.weight", "neek.conv4.conv.1.weight", "neek.conv4.conv.1.bias", "neek.conv4.conv.1.running_mean", "neek.conv4.conv.1.running_var", "neek.conv4.conv.1.num_batches_tracked", "neek.conv5.conv.0.weight", "neek.conv5.conv.1.weight", "neek.conv5.conv.1.bias", "neek.conv5.conv.1.running_mean", "neek.conv5.conv.1.running_var", "neek.conv5.conv.1.num_batches_tracked", "neek.conv6.conv.0.weight", "neek.conv6.conv.1.weight", "neek.conv6.conv.1.bias", "neek.conv6.conv.1.running_mean", "neek.conv6.conv.1.running_var", "neek.conv6.conv.1.num_batches_tracked", "neek.conv7.conv.0.weight", "neek.conv7.conv.1.weight", "neek.conv7.conv.1.bias", "neek.conv7.conv.1.running_mean", "neek.conv7.conv.1.running_var", "neek.conv7.conv.1.num_batches_tracked", "neek.conv8.conv.0.weight", "neek.conv8.conv.1.weight", "neek.conv8.conv.1.bias", "neek.conv8.conv.1.running_mean", "neek.conv8.conv.1.running_var", "neek.conv8.conv.1.num_batches_tracked", "neek.conv9.conv.0.weight", "neek.conv9.conv.1.weight", "neek.conv9.conv.1.bias", "neek.conv9.conv.1.running_mean", "neek.conv9.conv.1.running_var", "neek.conv9.conv.1.num_batches_tracked", "neek.conv10.conv.0.weight", "neek.conv10.conv.1.weight", "neek.conv10.conv.1.bias", "neek.conv10.conv.1.running_mean", "neek.conv10.conv.1.running_var", "neek.conv10.conv.1.num_batches_tracked", "neek.conv11.conv.0.weight", 
"neek.conv11.conv.1.weight", "neek.conv11.conv.1.bias", "neek.conv11.conv.1.running_mean", "neek.conv11.conv.1.running_var", "neek.conv11.conv.1.num_batches_tracked", "neek.conv12.conv.0.weight", "neek.conv12.conv.1.weight", "neek.conv12.conv.1.bias", "neek.conv12.conv.1.running_mean", "neek.conv12.conv.1.running_var", "neek.conv12.conv.1.num_batches_tracked", "neek.conv13.conv.0.weight", "neek.conv13.conv.1.weight", "neek.conv13.conv.1.bias", "neek.conv13.conv.1.running_mean", "neek.conv13.conv.1.running_var", "neek.conv13.conv.1.num_batches_tracked", "neek.conv14.conv.0.weight", "neek.conv14.conv.1.weight", "neek.conv14.conv.1.bias", "neek.conv14.conv.1.running_mean", "neek.conv14.conv.1.running_var", "neek.conv14.conv.1.num_batches_tracked", "neek.conv15.conv.0.weight", "neek.conv15.conv.1.weight", "neek.conv15.conv.1.bias", "neek.conv15.conv.1.running_mean", "neek.conv15.conv.1.running_var", "neek.conv15.conv.1.num_batches_tracked", "neek.conv16.conv.0.weight", "neek.conv16.conv.1.weight", "neek.conv16.conv.1.bias", "neek.conv16.conv.1.running_mean", "neek.conv16.conv.1.running_var", "neek.conv16.conv.1.num_batches_tracked", "neek.conv17.conv.0.weight", "neek.conv17.conv.1.weight", "neek.conv17.conv.1.bias", "neek.conv17.conv.1.running_mean", "neek.conv17.conv.1.running_var", "neek.conv17.conv.1.num_batches_tracked", "neek.conv18.conv.0.weight", "neek.conv18.conv.1.weight", "neek.conv18.conv.1.bias", "neek.conv18.conv.1.running_mean", "neek.conv18.conv.1.running_var", "neek.conv18.conv.1.num_batches_tracked", "neek.conv19.conv.0.weight", "neek.conv19.conv.1.weight", "neek.conv19.conv.1.bias", "neek.conv19.conv.1.running_mean", "neek.conv19.conv.1.running_var", "neek.conv19.conv.1.num_batches_tracked", "neek.conv20.conv.0.weight", "neek.conv20.conv.1.weight", "neek.conv20.conv.1.bias", "neek.conv20.conv.1.running_mean", "neek.conv20.conv.1.running_var", "neek.conv20.conv.1.num_batches_tracked". 
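
The two key lists above differ only in their prefix: the model expects "neck." while the checkpoint stores "neek.". One way to reconcile them without editing the model is to rename the checkpoint keys before calling load_state_dict. This is a sketch under that assumption and not necessarily what the fix referenced in this thread does; the helper name rename_prefix is hypothetical.

```python
from collections import OrderedDict


def rename_prefix(state_dict, old="neek.", new="neck."):
    """Return a copy of state_dict in which keys starting with `old` start with `new`.

    Hypothetical helper: key order and values are preserved, and the input
    mapping is left unmodified.
    """
    renamed = OrderedDict()
    for key, value in state_dict.items():
        if key.startswith(old):
            key = new + key[len(old):]
        renamed[key] = value
    return renamed


# Example use inside transform_to_onnx (names taken from the traceback above):
#   pretrained_dict = torch.load(weight_file, map_location="cpu")
#   model.load_state_dict(rename_prefix(pretrained_dict))
```

Renaming the keys at load time leaves both models.py and the weight file untouched, at the cost of one extra dictionary copy.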
mfoglio commented 2 years ago

Fixed with this: https://github.com/Tianxiaomo/pytorch-YOLOv4/issues/135#issuecomment-652790928 . Now I am getting the error:

The model expects input shape:  ['batch_size', 3, 608, 608]
Traceback (most recent call last):
  File "demo_pytorch2onnx.py", line 96, in <module>
    main(weight_file, image_path, batch_size, n_classes, IN_IMAGE_H, IN_IMAGE_W)
  File "demo_pytorch2onnx.py", line 81, in main
    detect(session, image_src)
TypeError: detect() missing 1 required positional argument: 'namesfile'
mfoglio commented 2 years ago

Fixed easily by replacing line 81 with: detect(session, image_src, "data/coco.names"). Hope this helps! I'll do a PR as soon as I have time ;)

cotyyang commented 2 years ago

Hi, I also encountered the state_dict error from the update above (missing "neck.*" keys, unexpected "neek.*" keys), but I think the best solution is to change self.neck to self.neek in models.py.