tomasvr / turtlebot3_drlnav

A ROS2-based framework for TurtleBot3 DRL autonomous navigation

Error when running on a real TurtleBot: mat1 and mat2 shapes cannot be multiplied #6

Closed LihanChen2004 closed 8 months ago

LihanChen2004 commented 1 year ago

Problem:

When attempting to run the example code on a real TurtleBot from within a Docker container, I encountered the error: mat1 and mat2 shapes cannot be multiplied

Steps to reproduce:

I run ros2 launch turtlebot3_bringup robot.launch.py on the TurtleBot's onboard computer to publish imu, odom, etc. The error occurs both with the sample model ddpg_0_stage9 from the repository and with my own trained model.

On another computer on the same LAN, I run ros2 run turtlebot3_drl real_environment and ros2 run turtlebot3_drl real_agent ddpg ddpg_1_stage_4 100.

Output:

gpu torch available: True
device name:  NVIDIA GeForce RTX 3070 Ti Laptop GPU
testing on stage: 4
loading: actor model from file:  /home/turtlebot3_drlnav/src/turtlebot3_drl/model/lihanchen/ddpg_1_stage_4/actor_stage4_episode100.pt
loading: target_actor model from file:  /home/turtlebot3_drlnav/src/turtlebot3_drl/model/lihanchen/ddpg_1_stage_4/target_actor_stage4_episode100.pt
loading: critic model from file:  /home/turtlebot3_drlnav/src/turtlebot3_drl/model/lihanchen/ddpg_1_stage_4/critic_stage4_episode100.pt
loading: target_critic model from file:  /home/turtlebot3_drlnav/src/turtlebot3_drl/model/lihanchen/ddpg_1_stage_4/target_critic_stage4_episode100.pt
global steps: 79628
loaded model ddpg_1_stage_4 (eps 100): 128, 1000000, 44, 2, 512, 0.99, 0.003, 0.003, 0.01, A, False, False, 3, 4
Waiting for new goal...  (if persists: reset gazebo_goals node)
Waiting for new goal...  (if persists: reset gazebo_goals node)

So I published the goal with ./spawn_goal.sh 1 1. The second node (real_agent) then reported an error:

  Traceback (most recent call last):
  File "/home/turtlebot3_drlnav/install/turtlebot3_drl/lib/turtlebot3_drl/real_agent", line 11, in <module>
    load_entry_point('turtlebot3-drl==2.0.0', 'console_scripts', 'real_agent')()
  File "/home/turtlebot3_drlnav/install/turtlebot3_drl/lib/python3.8/site-packages/turtlebot3_drl/drl_agent/drl_agent.py", line 215, in main_real
    main(args)
  File "/home/turtlebot3_drlnav/install/turtlebot3_drl/lib/python3.8/site-packages/turtlebot3_drl/drl_agent/drl_agent.py", line 200, in main
    drl_agent = DrlAgent(*args)
  File "/home/turtlebot3_drlnav/install/turtlebot3_drl/lib/python3.8/site-packages/turtlebot3_drl/drl_agent/drl_agent.py", line 108, in __init__
    self.process()
  File "/home/turtlebot3_drlnav/install/turtlebot3_drl/lib/python3.8/site-packages/turtlebot3_drl/drl_agent/drl_agent.py", line 133, in process
    action = self.model.get_action(state, self.training, step, ENABLE_VISUAL)
  File "/home/turtlebot3_drlnav/install/turtlebot3_drl/lib/python3.8/site-packages/turtlebot3_drl/drl_agent/ddpg.py", line 86, in get_action
    action = self.actor(state, visualize)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/turtlebot3_drlnav/install/turtlebot3_drl/lib/python3.8/site-packages/turtlebot3_drl/drl_agent/ddpg.py", line 34, in forward
    x1 = torch.relu(self.fa1(states))
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/linear.py", line 103, in forward
    return F.linear(input, self.weight, self.bias)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/functional.py", line 1848, in linear
    return torch._C._nn.linear(input, weight, bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x364 and 44x512)

Environment:

Host OS: [Ubuntu20.04] Docker version: [24.0.5] Host Cuda: [11.6]

Has anyone else run into a similar problem? What causes it, and what needs to be done to fix it? Any help would be much appreciated.

LihanChen2004 commented 1 year ago

I have found the reason for the matrix multiplication error.

To adapt to the real TurtleBot, I changed REAL_N_SCAN_SAMPLES from 40 to 360 in settings.py. This results in the matrix multiplication error:

RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x364 and 44x512)

Similarly, setting REAL_N_SCAN_SAMPLES to 180 produces:

RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x184 and 44x512)

How can I change the trained model's input layer from 44 x 512 to 364 x 512?
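
The numbers in both errors follow one pattern: 364 = 360 + 4, 184 = 180 + 4, and the trained layer expects 44 = 40 + 4. A minimal sketch of that arithmetic, assuming the state vector is the lidar samples plus 4 extra values (the constant is inferred from the error messages, and the helper names are hypothetical, not taken from the repository):

```python
# Assumption: state = lidar samples + 4 extra values. The "4" is inferred
# from the error messages (364 = 360 + 4, 184 = 180 + 4, 44 = 40 + 4).
EXTRA_STATE_VALUES = 4

# First dimension of the 44x512 weight reported in the error.
TRAINED_INPUT_SIZE = 44


def state_size(n_scan_samples: int) -> int:
    """Length of the state vector for a given number of lidar samples."""
    return n_scan_samples + EXTRA_STATE_VALUES


def check_state_matches_model(n_scan_samples: int, model_input_size: int) -> None:
    """Raise a readable error before inference if the sizes disagree."""
    actual = state_size(n_scan_samples)
    if actual != model_input_size:
        raise ValueError(
            f"state has {actual} values ({n_scan_samples} scan samples + "
            f"{EXTRA_STATE_VALUES}), but the model expects {model_input_size}"
        )


if __name__ == "__main__":
    check_state_matches_model(40, TRAINED_INPUT_SIZE)  # sizes agree, no error
    for n in (360, 180):
        try:
            check_state_matches_model(n, TRAINED_INPUT_SIZE)
        except ValueError as err:
            print(err)
```

A check like this only makes the mismatch easier to read. It does not resize the network, whose first layer is fixed by the number of scan samples used during training.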

LihanChen2004 commented 12 months ago

I have basically solved the problem by following #2.

tomasvr commented 10 months ago

The number of scan samples you use in simulation (NUM_SCAN_SAMPLES) should be equal to the number of scan samples you want to provide through the real world lidar (REAL_N_SCAN_SAMPLES). I have added a note for this in commit 6a4a948ea3524ce304aef08c94fb5fd4178119d1.
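
If the real lidar delivers more readings per sweep than the model was trained on, one generic way to make the counts equal is to keep only evenly spaced readings. The sketch below is an illustration under that assumption, not the repository's own code; downsample_scan is a hypothetical helper, and the 360 and 40 values are the ones mentioned in this thread.

```python
from typing import List, Sequence


def downsample_scan(ranges: Sequence[float], n_samples: int) -> List[float]:
    """Return n_samples evenly spaced readings from a full lidar sweep."""
    if n_samples <= 0 or n_samples > len(ranges):
        raise ValueError("n_samples must be between 1 and len(ranges)")
    # Integer arithmetic keeps every index strictly below len(ranges).
    return [ranges[(i * len(ranges)) // n_samples] for i in range(n_samples)]


if __name__ == "__main__":
    # Stand-in for a 360-reading sweep; each value equals its own index.
    full_scan = [float(i) for i in range(360)]
    reduced = downsample_scan(full_scan, 40)
    print(len(reduced), reduced[:3])
```

Picking single beams discards the readings in between, so a thin obstacle that falls between two kept beams is not seen. Retraining in simulation with NUM_SCAN_SAMPLES set to the real lidar's sample count avoids that trade-off.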