NVlabs / Deep_Object_Pose

Deep Object Pose Estimation (DOPE) – ROS inference (CoRL 2018)

cuDNN error: CUDNN_STATUS_INTERNAL_ERROR #211

Closed lkaesberg closed 2 years ago

lkaesberg commented 2 years ago

Hello, I am trying to get DOPE running to track objects with my webcam. I can load the model, but when I try to detect the pose I get a cuDNN error:

Traceback (most recent call last):
  File "D:/Informatik/Uni/deeplabcut_tools/src/evaluate_model.py", line 180, in <module>
    dope_result = ObjectDetector.detect_object_in_image(dope_model.net, dope_pnp_solver, rgb_image, dope_config)
  File "D:\Informatik\Uni\deeplabcut_tools\src\dope\inference\detector.py", line 256, in detect_object_in_image
    out, seg = net_model(image_torch)
  File "D:\Informatik\Uni\deeplabcut_tools\venv\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "D:\Informatik\Uni\deeplabcut_tools\venv\lib\site-packages\torch\nn\parallel\data_parallel.py", line 165, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "D:\Informatik\Uni\deeplabcut_tools\venv\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "D:\Informatik\Uni\deeplabcut_tools\src\dope\inference\detector.py", line 93, in forward
    out1 = self.vgg(x)
  File "D:\Informatik\Uni\deeplabcut_tools\venv\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "D:\Informatik\Uni\deeplabcut_tools\venv\lib\site-packages\torch\nn\modules\container.py", line 119, in forward
    input = module(input)
  File "D:\Informatik\Uni\deeplabcut_tools\venv\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "D:\Informatik\Uni\deeplabcut_tools\venv\lib\site-packages\torch\nn\modules\conv.py", line 399, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "D:\Informatik\Uni\deeplabcut_tools\venv\lib\site-packages\torch\nn\modules\conv.py", line 396, in _conv_forward
    self.padding, self.dilation, self.groups)
RuntimeError: cuDNN error: CUDNN_STATUS_INTERNAL_ERROR
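
For reference, this is the kind of sanity check I am running to rule out the PyTorch/CUDA installation itself (all standard PyTorch calls, nothing DOPE-specific); the last line is a workaround that several forum threads suggest for this exact error, falling back to the non-cuDNN convolution kernels:

    import torch

    # Basic sanity checks on the PyTorch / CUDA / cuDNN setup.
    print(torch.__version__)                    # PyTorch build
    print(torch.version.cuda)                   # CUDA version PyTorch was built against
    print(torch.cuda.is_available())            # should be True
    print(torch.backends.cudnn.is_available())  # should be True
    print(torch.backends.cudnn.version())       # bundled cuDNN version

    # Minimal convolution on the GPU; if this already fails, the problem is
    # in the PyTorch / driver combination rather than in DOPE itself.
    x = torch.randn(1, 3, 64, 64).cuda()
    conv = torch.nn.Conv2d(3, 8, 3).cuda()
    print(conv(x).shape)

    # Commonly suggested workaround: fall back to the non-cuDNN kernels.
    torch.backends.cudnn.enabled = False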

While looking for a solution, I found posts suggesting to run on the CPU only, since that gives more informative error messages. But the model won't let me run on the CPU: when I try, it says Torch needs to be compiled with CUDA.

Traceback (most recent call last):
  File "D:/Informatik/Uni/dope/src/dope/inference/infer.py", line 268, in <module>
    infer.load_config(config_name)
  File "D:/Informatik/Uni/dope/src/dope/inference/infer.py", line 126, in load_config
    self.models[model].load_net_model()
  File "D:\Informatik\Uni\dope\src\dope\inference\detector.py", line 214, in load_net_model
    self.net = self.load_net_model_path(self.net_path)
  File "D:\Informatik\Uni\dope\src\dope\inference\detector.py", line 225, in load_net_model_path
    net = torch.nn.DataParallel(net, [0]).cuda()
  File "C:\Users\Larsk\anaconda3\envs\yolov4-gpu\lib\site-packages\torch\nn\modules\module.py", line 458, in cuda
    return self._apply(lambda t: t.cuda(device))
  File "C:\Users\Larsk\anaconda3\envs\yolov4-gpu\lib\site-packages\torch\nn\modules\module.py", line 354, in _apply
    module._apply(fn)
  File "C:\Users\Larsk\anaconda3\envs\yolov4-gpu\lib\site-packages\torch\nn\modules\module.py", line 354, in _apply
    module._apply(fn)
  File "C:\Users\Larsk\anaconda3\envs\yolov4-gpu\lib\site-packages\torch\nn\modules\module.py", line 354, in _apply
    module._apply(fn)
  File "C:\Users\Larsk\anaconda3\envs\yolov4-gpu\lib\site-packages\torch\nn\modules\module.py", line 376, in _apply
    param_applied = fn(param)
  File "C:\Users\Larsk\anaconda3\envs\yolov4-gpu\lib\site-packages\torch\nn\modules\module.py", line 458, in <lambda>
    return self._apply(lambda t: t.cuda(device))
  File "C:\Users\Larsk\anaconda3\envs\yolov4-gpu\lib\site-packages\torch\cuda\__init__.py", line 186, in _lazy_init
    _check_driver()
  File "C:\Users\Larsk\anaconda3\envs\yolov4-gpu\lib\site-packages\torch\cuda\__init__.py", line 61, in _check_driver
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
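
As far as I can tell from that traceback, load_net_model_path (detector.py line 225) always wraps the network in DataParallel and calls .cuda(), so a CPU-only Torch build fails already at load time. A minimal sketch of a CPU-only load, assuming the network class is DopeNetwork as defined in dope/inference/detector.py and that the .pth file is a state dict saved from the DataParallel wrapper, would be:

    import torch

    from dope.inference.detector import DopeNetwork  # network class name assumed from detector.py

    def load_net_cpu(net_path):
        """Hypothetical CPU-only variant of ModelData.load_net_model_path()."""
        net = DopeNetwork()
        state_dict = torch.load(net_path, map_location=torch.device("cpu"))
        # The stock loader goes through nn.DataParallel, so the saved keys may
        # carry a "module." prefix; strip it if present.
        state_dict = {
            (k[len("module."):] if k.startswith("module.") else k): v
            for k, v in state_dict.items()
        }
        net.load_state_dict(state_dict)
        net.eval()
        return net

detect_object_in_image would probably also need its .cuda() calls on the input tensor removed for a full CPU run, so this only gets past the loading step.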

My code to detect the object:

    import cv2

    from dope.inference.cuboid import Cuboid3d
    from dope.inference.cuboid_pnp_solver import CuboidPNPSolver
    from dope.inference.detector import ModelData, ObjectDetector

    # Load the network weights for the gelatin object.
    dope_model = ModelData(name="Gelatin", net_path="./gelatin_60.pth")
    dope_model.load_net_model()

    # PnP solver: camera intrinsics and distortion coefficients (loaded elsewhere
    # in the script) plus the cuboid dimensions (cm) of the gelatin box.
    dope_pnp_solver = CuboidPNPSolver(
        dope_model,
        mat["intrinsic_matrix"],
        Cuboid3d([8.918299674987793, 7.311500072479248, 2.9983000755310059]),
        dist_coeffs=dist_coeffs)

    # Empty namespace object to hold the detection thresholds.
    dope_config = lambda: None
    dope_config.mask_edges = 1
    dope_config.mask_faces = 1
    dope_config.vertex = 1
    dope_config.threshold = 0.5
    dope_config.softmax = 1000
    dope_config.thresh_angle = 0.5
    dope_config.thresh_map = 0.01
    dope_config.sigma = 3
    dope_config.thresh_points = 0.1

    rgb_image = cv2.imread("{}/{}-color.png".format(args.dataset_path, dataset_file))

    dope_result = ObjectDetector.detect_object_in_image(
        dope_model.net, dope_pnp_solver, rgb_image, dope_config)
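
From what I have read, CUDNN_STATUS_INTERNAL_ERROR is often an out-of-memory error in disguise, so I also want to check GPU memory use and try a smaller input image before the detection call (the 640x480 resize below is just an example resolution):

    import cv2
    import torch

    # GPU memory already held by the loaded model.
    print(torch.cuda.memory_allocated() / 1024 ** 2, "MiB allocated")
    print(torch.cuda.memory_reserved() / 1024 ** 2, "MiB reserved")

    # Run the detector on a downscaled frame to rule out an out-of-memory cause.
    small_image = cv2.resize(rgb_image, (640, 480))
    dope_result = ObjectDetector.detect_object_in_image(
        dope_model.net, dope_pnp_solver, small_image, dope_config)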

Does anyone have an idea what is going wrong?