Unity-Technologies / Robotics-Object-Pose-Estimation

A complete end-to-end demonstration in which we collect training data in Unity and use that data to train a deep neural network to predict the pose of a cube. This model is then deployed in a simulated robotic pick-and-place task.
Apache License 2.0
293 stars 75 forks

Pose Estimation not working correctly #45

Closed tensarflow closed 2 years ago

tensarflow commented 2 years ago

Describe the bug

The pose estimation does not run correctly. I get an error about the model weights and the input not being on the same device. When I change this line to the following,

    device = torch.device("cpu")

it works fine.

To Reproduce

I used the demo Unity project, so I did not go through everything in the 4 READMEs.

Console logs / stack traces

[ERROR] [1640807467.034139]: Error processing request: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same
['Traceback (most recent call last):\n', '  File "/opt/ros/noetic/lib/python3/dist-packages/rospy/impl/tcpros_service.py", line 633, in _handle_request\n    response = convert_return_to_response(self.handler(request), self.response_class)\n', '  File "/home/ensar/Robotics-Object-Pose-Estimation/ROS/src/ur3_moveit/scripts/pose_estimation_script.py", line 96, in pose_estimation_main\n    est_position, est_rotation = _run_model(image_path)\n', '  File "/home/ensar/Robotics-Object-Pose-Estimation/ROS/src/ur3_moveit/scripts/pose_estimation_script.py", line 52, in _run_model\n    output = run_model_main(image_path, MODEL_PATH)\n', '  File "/home/ensar/Robotics-Object-Pose-Estimation/ROS/src/ur3_moveit/src/ur3_moveit/setup_and_run_model.py", line 138, in run_model_main\n    output_translation, output_orientation = model(torch.stack(image).reshape(-1, 3, 224, 224))\n', '  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 727, in _call_impl\n    result = self.forward(*input, **kwargs)\n', '  File "/home/ensar/Robotics-Object-Pose-Estimation/ROS/src/ur3_moveit/src/ur3_moveit/setup_and_run_model.py", line 54, in forward\n    x = self.model_backbone(x)\n', '  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 727, in _call_impl\n    result = self.forward(*input, **kwargs)\n', '  File "/usr/local/lib/python3.8/dist-packages/torchvision/models/vgg.py", line 43, in forward\n    x = self.features(x)\n', '  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 727, in _call_impl\n    result = self.forward(*input, **kwargs)\n', '  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/container.py", line 117, in forward\n    input = module(input)\n', '  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 727, in _call_impl\n    result = self.forward(*input, **kwargs)\n', '  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/conv.py", line 423, in forward\n   
 return self._conv_forward(input, self.weight)\n', '  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/conv.py", line 419, in _conv_forward\n    return F.conv2d(input, weight, self.bias, self.stride,\n', 'RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same\n']

Expected behavior

A working pose estimation.

Environment (please complete the following information, where applicable):

at669 commented 2 years ago

Thanks for the report! I've filed an internal ticket for this, and the team will look into it.

[Ticket#: AIRO-1654]

JonathanLeban commented 2 years ago

Hello, I am not able to reproduce your error, but here is a suggestion. Change line 136 of https://github.com/Unity-Technologies/Robotics-Object-Pose-Estimation/blob/main/ROS/src/ur3_moveit/src/ur3_moveit/setup_and_run_model.py to the following:

    output_translation, output_orientation = model(torch.stack(image).reshape(-1, 3, 224, 224).to(device))

I hope this solves the issue. The change you made is also fine: it forces inference to run on the CPU rather than the GPU (if the machine has one), and since the network is fairly light, the inference time will be roughly the same in both cases.
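The device-consistency pattern behind both fixes can be sketched as follows. This is a minimal, self-contained example: the tiny `nn.Sequential` model here is a hypothetical stand-in for the project's VGG-based pose-estimation network, not the actual code.

```python
import torch
import torch.nn as nn

# Pick the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Tiny stand-in for the pose-estimation backbone (hypothetical; the real
# project defines its network in setup_and_run_model.py).
model = nn.Sequential(nn.Conv2d(3, 8, kernel_size=3), nn.Flatten(), nn.LazyLinear(7))
model.to(device)  # move the weights onto the chosen device
model.eval()

# The input batch must live on the SAME device as the weights, hence the
# .to(device) appended to the reshaped image tensor in the suggestion above.
image = torch.rand(3, 224, 224)
batch = torch.stack([image]).reshape(-1, 3, 224, 224).to(device)

with torch.no_grad():
    output = model(batch)

print(output.shape)  # torch.Size([1, 7])
```

Dropping either `.to(device)` call on a CUDA machine reproduces exactly the `Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor)` mismatch from the report.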

adakoda commented 2 years ago

I had the same problem and got the same error message:

    RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same

This means the input image is a torch.cuda.FloatTensor, but the model weights are torch.FloatTensor. Should we call model.to(device) after calling model.load_state_dict()?
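For what it's worth, the loading order in question can be sketched like this. The `nn.Linear` module is a placeholder for the real network, and the in-memory buffer stands in for the checkpoint file; only the ordering of the calls matters.

```python
import io
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(4, 2)  # placeholder for the pose-estimation network

# Pretend checkpoint: serialize a state dict into an in-memory buffer.
buffer = io.BytesIO()
torch.save(model.state_dict(), buffer)
buffer.seek(0)

# Load the weights onto the CPU first, then move the whole module to the
# target device: model.to(device) AFTER model.load_state_dict().
state = torch.load(buffer, map_location="cpu")
model.load_state_dict(state)
model.to(device)

assert next(model.parameters()).device.type == device.type
```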

Official document:

After the model returns torch.cuda.FloatTensor outputs, we also need to call cpu() before calling detach():

    output_translation, output_orientation = output_translation.cpu().detach().numpy(), output_orientation.cpu().detach().numpy()
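That last step runs in isolation as below, with random tensors standing in for the network outputs (the shapes here are assumptions, chosen to match a 3-vector position and a quaternion rotation):

```python
import torch

# Stand-ins for the network outputs. On a GPU machine these would be
# torch.cuda.FloatTensor, so we must hop back to host memory before
# NumPy can read them; on a CPU-only machine cpu() is a no-op.
output_translation = torch.rand(1, 3, requires_grad=True)
output_orientation = torch.rand(1, 4, requires_grad=True)

# detach() drops the autograd graph, cpu() copies GPU tensors to the
# host, and numpy() wraps the resulting buffer as an ndarray.
est_position = output_translation.cpu().detach().numpy()
est_rotation = output_orientation.cpu().detach().numpy()

print(est_position.shape, est_rotation.shape)  # (1, 3) (1, 4)
```

Calling numpy() directly on a CUDA tensor (or on a tensor still attached to the autograd graph) raises an error, which is why both calls are needed.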

JonathanLeban commented 2 years ago

Thank you @adakoda for your report. I will make the changes.

at669 commented 2 years ago

Thanks @JonathanLeban for the PR linked above!

This fix has been merged into the dev branch and will later be merged into main. Please feel free to pull the branch locally to verify that it resolves your issue. I will close the ticket out for now, but go ahead and reopen it if the problem persists. Thanks all!