This is a screenshot from another project I am running: https://github.com/ZJU-Robotics-Lab/model-based-social-navigation. When I run that project inside a conda environment, the error shown occurs, but no error is reported when it runs outside the conda environment. The usual advice online is to simply deactivate conda. However, your project runs fine directly in the conda environment, so may I ask whether you encountered similar problems before and how you avoided them?
Seems to me that you are running into problems with dependencies or Python versions. From the output here it looks like ROS Kinetic is called with python2 while your conda environment runs python3 and calls python3 packages. If you are running Kinetic, you should be using python2. It is hard to say what the difference is between my implementation and theirs, as I do not know how their code works. Perhaps it is the explicit call of python3, but I do not know for sure. In any case, you should check that you have all packages installed in the conda environment and that the Python versions match throughout your pipeline.
Thank you for your reply. I don't think it is a problem with the package in conda; when I import re directly in conda's Python, no error is reported.
In that case I would suggest checking whether both python2 and python3 have this package installed and whether the paths are properly set up in conda.
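As a generic sanity check (not specific to this repo), you can print which interpreter and search paths a given call actually resolves to, and run it once with python2 and once with python3:

from __future__ import print_function  # so the script behaves the same in python2 and python3
import sys

print("Executable:", sys.executable)   # which python binary is actually running
print("Version:", sys.version)         # confirms python2 vs python3
for p in sys.path:                     # directories where packages are resolved from
    print("Search path:", p)

# Check where a specific package is resolved from, e.g. the re module in question
import re
print("re module location:", re.__file__)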
Both python2 and python3 have this package installed, and the conda paths have not changed from before.
Another question: have you tested other reinforcement learning algorithms, such as DQN or DDPG, in this simulation?
There should be no issue with training other types of architectures with this simulation, as the simulation is only there to gather samples and execute actions. The learning is a separate function from the simulation. I have actually tried both DQN and DDPG within this framework in one way or another, but the best architecture depends on what you need. DQN gives discrete outputs. You can either map the outputs to velocities within your training function or deal with it directly in the simulation framework. But since it uses a discrete action space, I prefer TD3. DDPG is fine and uses fewer resources in training (because it has one less critic), but it is quite unstable due to overestimating the Q-value. TD3 is essentially designed to deal with this issue and is more or less an updated, more stable and better version of DDPG.
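As a rough illustration of the mapping option (not code from this repo; the velocity values and number of actions are arbitrary placeholders), a discrete DQN action index could simply select a predefined (linear, angular) velocity pair:

# Hypothetical sketch: mapping a discrete DQN action index to velocity commands.
DISCRETE_ACTIONS = [
    (0.5, 0.0),   # go straight
    (0.3, 0.5),   # turn left while moving
    (0.3, -0.5),  # turn right while moving
    (0.0, 1.0),   # rotate left in place
    (0.0, -1.0),  # rotate right in place
]

def action_to_velocity(action_index):
    """Return a (linear, angular) velocity pair for a discrete action index."""
    linear, angular = DISCRETE_ACTIONS[action_index]
    return linear, angular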
Thanks for your reply, I will also try reinforcement learning algorithms like DDPG later
Hello, may I ask whether you have added moving pedestrians in Gazebo before? Is there anything in this project I can refer to for that?
There are two ways I can think of off the top of my head that could at least simulate dynamic obstacles in the environment.
Thank you for your reply
Hello, I recently found out that you can use an actor model to create moving pedestrians, but this requires Gazebo version 8 or above. After I installed Gazebo 8 and ran your project, the problem in the picture appeared. Do you have any way to solve it?
ROS Kinetic supports Gazebo 7. For newer versions of Gazebo I would suggest using a newer version of ROS (Melodic or Noetic); that should solve the issue with unsupported packages and different naming conventions. It seems to me that Gazebo is looking for something in the TF2 ROS package that is not supported by the TF2 package version you have installed, i.e. you have a version mismatch.
Thank you for your reply. I found some information showing that some people have successfully installed Gazebo 8 on Ubuntu 16.04. If it still doesn't work, I will try to upgrade Ubuntu to 18.04.
Yes, you can probably run Gazebo 8 on Ubuntu 16.04, but the issue here is a mismatch between ROS and Gazebo. There is probably a way to do it by changing packages, editing source code, or compiling packages from source, but it most likely will not work "out-of-the-box". If you do not have a very specific reason to stick with older versions of ROS, the better way is probably to update the ROS pipeline.
Hello, I recently tried to use the DDPG algorithm from the first version of your paper to compare against the TD3 algorithm, but with the DDPG algorithm I modified, after training, the car keeps exploring in the same place and does not move forward. How can I solve this? The DDPG implementation I referred to is here: https://github.com/sfujim/TD3 (ddpg.py and ourddpg.py in that repo). May I ask how you used the DDPG algorithm with this project?
Sorry, I do not understand what the following means:
"after training, the car will appear in the same place to explore and not move forward"

Can you explain your issue in more detail?
Their DDPG looks fine and should train something, but that is not part of this repo and was written by someone else, so I wouldn't know about any issues regarding it. I also do not know how you implemented it, so it is impossible to answer such general questions. Please be much more specific.
At present, it has been trained for 150 evaluation episodes, but the behavior shown in this video still occurs often.
Again, I do not know your implementation so I do not know how to solve this for your situation. But I can give a high level overview of what I think is happening here.
It appears to me that the issue is that you have struck a sort of local optimum. You will see that the goal point is located right behind the robot, so what happens is:
You can see that it has been caught in a loop. Since all the decisions in the neural network are independent and there is no history, the network has no way of knowing it is stuck or "frozen". So it just repeats the best actions it has calculated so far.
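As a side note, a common way to keep a deterministic policy from endlessly repeating the same action during training is to add exploration noise to the actor output. A generic DDPG/TD3-style sketch (not taken from your implementation; expl_noise and max_action are placeholders) would be:

import numpy as np

def noisy_action(actor_output, max_action, expl_noise=0.1):
    # Add Gaussian exploration noise to the deterministic policy output so the
    # agent can break out of repetitive behavior during training.
    noise = np.random.normal(0, expl_noise * max_action, size=actor_output.shape)
    return np.clip(actor_output + noise, -max_action, max_action)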
I tried DDPG in the three ways shown below, but none of them produced good results.
import torch
import torch.nn as nn
import torch.nn.functional as F

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


# Variant 1: the action is concatenated into the critic at the second hidden layer.
class Actor(nn.Module):
    def __init__(self, state_dim, action_dim, max_action):
        super(Actor, self).__init__()
        self.layer_1 = nn.Linear(state_dim, 800)
        self.layer_2 = nn.Linear(800, 600)
        self.layer_3 = nn.Linear(600, action_dim)
        self.max_action = max_action
        self.soft = nn.Softsign()
        self.tanh = nn.Tanh()
        self.sig = nn.Sigmoid()

    def forward(self, s):
        s = F.relu(self.layer_1(s))
        s = F.relu(self.layer_2(s))
        a = self.tanh(self.layer_3(s))
        return a


class Critic(nn.Module):
    def __init__(self, state_dim, action_dim):
        super(Critic, self).__init__()
        self.layer_1 = nn.Linear(state_dim, 800)
        self.layer_2 = nn.Linear(800 + action_dim, 600)
        self.layer_3 = nn.Linear(600, 1)

    def forward(self, s, a):
        q = F.relu(self.layer_1(s))
        q = F.relu(self.layer_2(torch.cat([q, a], 1)))
        q = self.layer_3(q)
        return q


class DDPG(object):
    def __init__(self, state_dim, action_dim, max_action):
        # Initialize the Actor network
        self.actor = Actor(state_dim, action_dim, max_action).to(device)
        self.actor_target = Actor(state_dim, action_dim, max_action).to(device)
        self.actor_target.load_state_dict(self.actor.state_dict())
        self.actor_optimizer = torch.optim.Adam(self.actor.parameters())

        # Initialize the Critic networks
        self.critic = Critic(state_dim, action_dim).to(device)
        self.critic_target = Critic(state_dim, action_dim).to(device)
        self.critic_target.load_state_dict(self.critic.state_dict())
        self.critic_optimizer = torch.optim.Adam(self.critic.parameters())

        self.max_action = max_action

    # Function to get the action from the actor
    def get_action(self, state):
        state = torch.Tensor(state.reshape(1, -1)).to(device)
        return self.actor(state).cpu().data.numpy().flatten()

    # Training cycle (policy_noise, noise_clip and policy_freq are unused in this
    # DDPG version and only kept to match the TD3 train() signature)
    def train(self, replay_buffer, iterations, batch_size=100, discount=1,  # discount=0.99
              tau=0.005, policy_noise=0.2, noise_clip=0.5, policy_freq=2):
        for it in range(iterations):
            # Sample a batch from the replay buffer
            (batch_states, batch_actions, batch_rewards, batch_dones,
             batch_next_states) = replay_buffer.sample_batch(batch_size)
            state = torch.Tensor(batch_states).to(device)
            next_state = torch.Tensor(batch_next_states).to(device)
            action = torch.Tensor(batch_actions).to(device)
            reward = torch.Tensor(batch_rewards).to(device)
            done = torch.Tensor(batch_dones).to(device)

            # Obtain the estimated action for the next state from the actor-target
            next_action = self.actor_target(next_state)

            # Calculate the Q values from the critic-target network for the next state-action pair
            target_Q = self.critic_target(next_state, next_action)

            # Calculate the final Q value from the target network parameters by using the Bellman equation
            target_Q = reward + ((1 - done) * discount * target_Q).detach()

            # Get the Q values of the base network with the current parameters
            current_Q = self.critic(state, action)

            # Calculate the loss between the current Q value and the target Q value
            loss = F.mse_loss(current_Q, target_Q)

            # Perform the gradient descent step on the critic
            self.critic_optimizer.zero_grad()
            loss.backward()
            self.critic_optimizer.step()

            # Maximize the critic's estimate of the actor's action (deterministic policy gradient)
            actor_grad = self.critic(state, self.actor(state))
            actor_grad = -actor_grad.mean()
            self.actor_optimizer.zero_grad()
            actor_grad.backward()
            self.actor_optimizer.step()

            # Use soft update to update the actor-target network parameters by
            # infusing a small amount of the current parameters
            for param, target_param in zip(self.actor.parameters(), self.actor_target.parameters()):
                target_param.data.copy_(tau * param.data + (1 - tau) * target_param.data)

            # Use soft update to update the critic-target network parameters by
            # infusing a small amount of the current parameters
            for param, target_param in zip(self.critic.parameters(), self.critic_target.parameters()):
                target_param.data.copy_(tau * param.data + (1 - tau) * target_param.data)
# Variant 2: the state and action are concatenated at the critic input layer.
class Actor(nn.Module):
    def __init__(self, state_dim, action_dim, max_action):
        super(Actor, self).__init__()
        self.layer_1 = nn.Linear(state_dim, 800)
        self.layer_2 = nn.Linear(800, 600)
        self.layer_3 = nn.Linear(600, action_dim)
        self.max_action = max_action
        self.soft = nn.Softsign()
        self.tanh = nn.Tanh()
        self.sig = nn.Sigmoid()

    def forward(self, s):
        s = F.relu(self.layer_1(s))
        s = F.relu(self.layer_2(s))
        a = self.tanh(self.layer_3(s))
        return a


class Critic(nn.Module):
    def __init__(self, state_dim, action_dim):
        super(Critic, self).__init__()
        self.layer_1 = nn.Linear(state_dim + action_dim, 800)
        self.layer_2 = nn.Linear(800, 600)
        self.layer_3 = nn.Linear(600, 1)

    def forward(self, s, a):
        q = F.relu(self.layer_1(torch.cat([s, a], 1)))
        q = F.relu(self.layer_2(q))
        q = self.layer_3(q)
        return q


class DDPG(object):
    def __init__(self, state_dim, action_dim, max_action):
        # Initialize the Actor network
        self.actor = Actor(state_dim, action_dim, max_action).to(device)
        self.actor_target = Actor(state_dim, action_dim, max_action).to(device)
        self.actor_target.load_state_dict(self.actor.state_dict())
        self.actor_optimizer = torch.optim.Adam(self.actor.parameters())

        # Initialize the Critic networks
        self.critic = Critic(state_dim, action_dim).to(device)
        self.critic_target = Critic(state_dim, action_dim).to(device)
        self.critic_target.load_state_dict(self.critic.state_dict())
        self.critic_optimizer = torch.optim.Adam(self.critic.parameters())

        self.max_action = max_action

    # Function to get the action from the actor
    def get_action(self, state):
        state = torch.Tensor(state.reshape(1, -1)).to(device)
        return self.actor(state).cpu().data.numpy().flatten()

    # Training cycle (policy_noise, noise_clip and policy_freq are unused in this
    # DDPG version and only kept to match the TD3 train() signature)
    def train(self, replay_buffer, iterations, batch_size=100, discount=1,  # discount=0.99
              tau=0.005, policy_noise=0.2, noise_clip=0.5, policy_freq=2):
        for it in range(iterations):
            # Sample a batch from the replay buffer
            (batch_states, batch_actions, batch_rewards, batch_dones,
             batch_next_states) = replay_buffer.sample_batch(batch_size)
            state = torch.Tensor(batch_states).to(device)
            next_state = torch.Tensor(batch_next_states).to(device)
            action = torch.Tensor(batch_actions).to(device)
            reward = torch.Tensor(batch_rewards).to(device)
            done = torch.Tensor(batch_dones).to(device)

            # Obtain the estimated action for the next state from the actor-target
            next_action = self.actor_target(next_state)

            # Calculate the Q values from the critic-target network for the next state-action pair
            target_Q = self.critic_target(next_state, next_action)

            # Calculate the final Q value from the target network parameters by using the Bellman equation
            target_Q = reward + ((1 - done) * discount * target_Q).detach()

            # Get the Q values of the base network with the current parameters
            current_Q = self.critic(state, action)

            # Calculate the loss between the current Q value and the target Q value
            loss = F.mse_loss(current_Q, target_Q)

            # Perform the gradient descent step on the critic
            self.critic_optimizer.zero_grad()
            loss.backward()
            self.critic_optimizer.step()

            # Maximize the critic's estimate of the actor's action (deterministic policy gradient)
            actor_grad = self.critic(state, self.actor(state))
            actor_grad = -actor_grad.mean()
            self.actor_optimizer.zero_grad()
            actor_grad.backward()
            self.actor_optimizer.step()

            # Use soft update to update the actor-target network parameters by
            # infusing a small amount of the current parameters
            for param, target_param in zip(self.actor.parameters(), self.actor_target.parameters()):
                target_param.data.copy_(tau * param.data + (1 - tau) * target_param.data)

            # Use soft update to update the critic-target network parameters by
            # infusing a small amount of the current parameters
            for param, target_param in zip(self.critic.parameters(), self.critic_target.parameters()):
                target_param.data.copy_(tau * param.data + (1 - tau) * target_param.data)
# Variant 3: the state and action are passed through separate linear layers and
# combined manually before the output layer.
class Actor(nn.Module):
    def __init__(self, state_dim, action_dim, max_action):
        super(Actor, self).__init__()
        self.layer_1 = nn.Linear(state_dim, 800)
        self.layer_2 = nn.Linear(800, 600)
        self.layer_3 = nn.Linear(600, action_dim)
        self.max_action = max_action
        self.soft = nn.Softsign()
        self.tanh = nn.Tanh()
        self.sig = nn.Sigmoid()

    def forward(self, s):
        s = F.relu(self.layer_1(s))
        s = F.relu(self.layer_2(s))
        a = self.tanh(self.layer_3(s))
        return a


class Critic(nn.Module):
    def __init__(self, state_dim, action_dim):
        super(Critic, self).__init__()
        self.layer_1 = nn.Linear(state_dim, 800)
        self.layer_2_s = nn.Linear(800, 600)
        self.layer_2_a = nn.Linear(action_dim, 600)
        self.layer_3 = nn.Linear(600, 1)

    def forward(self, s, a):
        s1 = F.relu(self.layer_1(s))
        self.layer_2_s(s1)
        self.layer_2_a(a)
        s11 = torch.mm(s1, self.layer_2_s.weight.data.t())
        s12 = torch.mm(a, self.layer_2_a.weight.data.t())
        s1 = F.relu(s11 + s12 + self.layer_2_a.bias.data)
        q1 = self.layer_3(s1)
        return q1


class DDPG_2(object):
    def __init__(self, state_dim, action_dim, max_action):
        # Initialize the Actor network
        self.actor = Actor(state_dim, action_dim, max_action).to(device)
        self.actor_target = Actor(state_dim, action_dim, max_action).to(device)
        self.actor_target.load_state_dict(self.actor.state_dict())
        self.actor_optimizer = torch.optim.Adam(self.actor.parameters())

        # Initialize the Critic networks
        self.critic = Critic(state_dim, action_dim).to(device)
        self.critic_target = Critic(state_dim, action_dim).to(device)
        self.critic_target.load_state_dict(self.critic.state_dict())
        self.critic_optimizer = torch.optim.Adam(self.critic.parameters())

        self.max_action = max_action

    # Function to get the action from the actor
    def get_action(self, state):
        state = torch.Tensor(state.reshape(1, -1)).to(device)
        return self.actor(state).cpu().data.numpy().flatten()

    # Training cycle (policy_noise, noise_clip and policy_freq are unused in this
    # DDPG version and only kept to match the TD3 train() signature)
    def train(self, replay_buffer, iterations, batch_size=100, discount=1,  # discount=0.99
              tau=0.005, policy_noise=0.2, noise_clip=0.5, policy_freq=2):
        for it in range(iterations):
            # Sample a batch from the replay buffer
            (batch_states, batch_actions, batch_rewards, batch_dones,
             batch_next_states) = replay_buffer.sample_batch(batch_size)
            state = torch.Tensor(batch_states).to(device)
            next_state = torch.Tensor(batch_next_states).to(device)
            action = torch.Tensor(batch_actions).to(device)
            reward = torch.Tensor(batch_rewards).to(device)
            done = torch.Tensor(batch_dones).to(device)

            # Obtain the estimated action from the next state by using the actor-target
            next_action = self.actor_target(next_state)

            # Calculate the Q values from the critic-target network for the next state-action pair
            target_Q = self.critic_target(next_state, next_action)

            # Calculate the final Q value from the target network parameters by using the Bellman equation
            target_Q = reward + ((1 - done) * discount * target_Q).detach()

            # Get the Q values of the base network with the current parameters
            current_Q = self.critic(state, action)

            # Calculate the loss between the current Q value and the target Q value
            loss = F.mse_loss(current_Q, target_Q)

            # Perform the gradient descent step on the critic
            self.critic_optimizer.zero_grad()
            loss.backward()
            self.critic_optimizer.step()

            # Maximize the critic's estimate of the actor's action (deterministic policy gradient)
            actor_grad = self.critic(state, self.actor(state))
            actor_grad = -actor_grad.mean()
            self.actor_optimizer.zero_grad()
            actor_grad.backward()
            self.actor_optimizer.step()

            # Use soft update to update the actor-target network parameters by
            # infusing a small amount of the current parameters
            for param, target_param in zip(self.actor.parameters(), self.actor_target.parameters()):
                target_param.data.copy_(tau * param.data + (1 - tau) * target_param.data)

            # Use soft update to update the critic-target network parameters by
            # infusing a small amount of the current parameters
            for param, target_param in zip(self.critic.parameters(), self.critic_target.parameters()):
                target_param.data.copy_(tau * param.data + (1 - tau) * target_param.data)
These all seem like decent enough implementations of DDPG. I don't think that changing where the action is introduced into the critic makes a big difference for training here, though, so that alone will probably not bring a lot of benefit.
DDPG is very fickle and unstable (which is why TD3 was developed), so it might be that you just need to run the training multiple times. Have you tried using a bunch of different seed values for your training?
So far I have only changed the network and have not modified the random seed value. Is modifying this value helpful for training?
The initial weights of the network are initialized randomly, based on your seed value. There might be occasions where the weights settle in such a way that your step size cannot escape some sort of local optimum, rendering the training useless. Setting a different seed value will initialize the network weights differently and, if you are lucky, in a way that is conducive to training. This is especially important for unstable network structures.
Set seed values are used for reproducibility. But when developing a network, finding the right seed value can be just as crucial as finding the right architecture.
So in short: try a bunch of different seed values. That might help.
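For example, seeding the usual sources of randomness at the top of the training script could look like this (a minimal sketch; adjust it to whatever libraries your training loop actually uses):

import random
import numpy as np
import torch

def set_seed(seed):
    # Seed all the common sources of randomness so each run is reproducible
    # and different seed values give genuinely different weight initializations.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed)

set_seed(42)  # try several different values, e.g. 0, 7, 42, 1234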
Thanks for your answer, I will try different random seeds later
Hello, may I ask what the reason for this error is? Although it does not affect the continued operation of the car, it is constantly printed on the command line.
Currently I am training with Gazebo 9 and ROS Melodic.
This is a threading issue where a call is made to a None object. You can try changing the reading of the scan and odom topics from the wait_for_message type to a subscriber type, as was done here: https://github.com/reiniscimurs/DRL-robot-navigation/issues/14#issuecomment-1107967435
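The general idea of that change, sketched here from memory (check the linked comment for the exact code; the odom topic name below is just a placeholder), is to keep subscriber callbacks that store the latest messages instead of blocking on wait_for_message:

import rospy
from sensor_msgs.msg import LaserScan
from nav_msgs.msg import Odometry

# Sketch of the subscriber-based approach: callbacks store the latest messages,
# and the training loop reads these cached values instead of calling
# rospy.wait_for_message(). A node must be initialized with rospy.init_node()
# before the subscribers are created.
class SensorListener(object):
    def __init__(self):
        self.last_scan = None
        self.last_odom = None
        rospy.Subscriber('/r1/front_laser/scan', LaserScan, self.scan_callback, queue_size=1)
        rospy.Subscriber('/r1/odom', Odometry, self.odom_callback, queue_size=1)  # placeholder topic name

    def scan_callback(self, msg):
        self.last_scan = msg

    def odom_callback(self, msg):
        self.last_odom = msg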
Thanks, this solution seems to work so far
Hello, I want to use data from a 16-line Velodyne lidar in practice. How should I change the Velodyne-related code in your project so that it only uses the 180 degrees of data in front?
The simulated device in Gazebo already uses only the 180 degrees in the front. Do you mean a physical velodyne device on a real robot? Physical implementations are not part of this repo and I can't really help you there because I do not know what kind of hardware you want to use, but you can always just record the 360 degree data and filter out the 180 degrees in front of the robot.
I want to use the same 16-line Velodyne lidar as in the simulation. I see that you only use the front 180 degrees of the laser in the simulation, but I do not know how to read the laser data for the front 180 degrees in practice. Another question: in your code, data = rospy.wait_for_message('/r1/front_laser/scan', LaserScan, timeout=0.5) acquires laser data, while the function def velodyne_callback(self, v) converts point cloud data. What is the difference in the meaning of the distance values between the two?
The data = rospy.wait_for_message('/r1/front_laser/scan', LaserScan, timeout=0.5) call is not related to the velodyne data. It is used to read the lidar data to detect a collision. After that, the velodyne data is read through the velodyne callback velodyne_callback(self, v). The velodyne data is then squashed into 2D, binned, and used as the state information.
"What is the difference between the meaning of the value of the distance?"
I do not quite understand this question. If you mean what value the sensors return, then for the lidar it is a distance and for the velodyne it is point cloud coordinate data. To obtain distance information from the velodyne data, the Euclidean distance to each coordinate is calculated. Then, the angle at which this distance is obtained is calculated. These calculations are carried out in the velodyne callback.
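Conceptually, the distance-and-angle computation with binning looks something like this (a simplified sketch of the idea, not the exact code from the repo's callback):

import math

# Simplified sketch: turn each point cloud point (x, y, z) into a (distance, angle)
# pair and keep the minimum distance per angular sector. The bins list would
# typically start filled with a large value, e.g. [10.0] * n_sectors, and
# bin_edges holds the n_sectors + 1 sector boundaries in radians.
def update_bins(points, bins, bin_edges):
    for x, y, z in points:
        dist = math.sqrt(x ** 2 + y ** 2 + z ** 2)  # Euclidean distance to the point
        angle = math.atan2(y, x)                    # angle at which that distance occurs
        for i in range(len(bin_edges) - 1):
            if bin_edges[i] <= angle < bin_edges[i + 1]:
                bins[i] = min(bins[i], dist)        # closest obstacle per sector
                break
    return bins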
If you want to use a real velodyne device, you should just connect it to your robot computer and publish its data on its own topic. I do not remember if you can set the FOV of the velodyne. If yes, then you can just set it. If not, then just read all the point cloud data from the velodyne and filter out the angles that you do not want in the state information. In this case you would only need to slightly modify the velodyne callback, as sketched below.
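For example, keeping only the front 180 degrees of a full point cloud could be done with a simple angle check like this (just a sketch, assuming the sensor frame's x-axis points forward):

import math

# Keep only points in the front half of the sensor frame, i.e. angles in [-90, 90] degrees.
def keep_front_180(points):
    front = []
    for x, y, z in points:
        angle = math.atan2(y, x)
        if -math.pi / 2.0 <= angle <= math.pi / 2.0:
            front.append((x, y, z))
    return front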
Do you mean that with data = rospy.wait_for_message('/r1/front_laser/scan', LaserScan, timeout=0.5) you use another lidar to obtain laser data in the simulation? My impression was that the simulation only uses the velodyne lidar. Another question: when I actually use the velodyne lidar, there is a /scan topic that can be used to detect collisions. Do you know what the difference is between this topic and the 2D values that the callback function velodyne_callback(self, v) compresses from the velodyne data?
If you look at the rqt graph while running the simulation, you will see the attached devices. Besides the velodyne 3D lidar, there is also a 2D rpLidar attached to the robot in the simulation. The /scan topic you mention belongs to the 2D rpLidar and is used only to detect collisions. It is done this way because it is a very simple way to detect them. Other methods include using bumpers, but those are notoriously difficult to implement in ROS, or using the velodyne data directly, but then you need to make sure you do not make mistakes when filtering out the floor data and so on. That is why the rpLidar is used for collisions, as it is a more robust way to detect them.
As for the laser information that is later given to the network, only the velodyne data from the velodyne callback function is used.
TL;DR: the /scan topic is for a separate lidar that only detects collisions; it has nothing to do with the velodyne callback and data.
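For reference, a minimal version of such a collision check on the 2D lidar ranges could look like this (a sketch only; the threshold is an arbitrary example, and real scans may need inf/NaN values filtered out first):

# Collision check on the 2D lidar ranges from the /scan-type topic.
COLLISION_DIST = 0.35  # arbitrary example threshold in meters

def check_collision(laser_ranges):
    # A collision is assumed if any beam reports an obstacle closer than the threshold.
    return min(laser_ranges) < COLLISION_DIST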
Closing due to inactivity