reiniscimurs / GDAE

Goal-driven autonomous exploration through deep reinforcement learning (ICRA 2022): a system that combines reactive and planned robot navigation in unknown environments

problem with GDAM.py #15

Open bit-lsj opened 11 months ago

bit-lsj commented 11 months ago

Hi, thank you for making such great work open source. But I do have some questions about the following code:

```python
def test(env, actor):
    while True:
        action = [0.0, 0.0]
        s2, toGoal = env.step(action)
        s = np.append(s2, toGoal)   # line 40
        s = np.append(s, action)    # line 41

        while True:
            a = actor.predict([s])
            aIn = a
            aIn[0, 0] = (aIn[0, 0] + 1) / 4
            s2, toGoal = env.step(aIn[0])
            s = np.append(s2, a[0])    # line 49
            s = np.append(s, toGoal)   # line 50
```

In GDAM.py, lines 40-41 (`s = np.append(s2, toGoal)` followed by `s = np.append(s, action)`), the state fed to the network is the combination "laser ranges + goal (dis and theta) + action (linear and angular)", assembled in the same order as in DRL-navigation. However, in lines 49-50 (`s = np.append(s2, a[0])` followed by `s = np.append(s, toGoal)`), the order becomes "laser ranges + action (linear and angular) + goal (dis and theta)", which differs from the order used in lines 40-41. Is there a mistake in my understanding? Thank you very much for taking the time to answer my questions!
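To make the difference concrete, here is a minimal sketch of the two layouts (the array sizes and values below are made up purely for illustration):

```python
import numpy as np

laser = np.zeros(20)   # stand-in for the laser ranges
toGoal = [1.0, 0.5]    # [distance, theta] to the goal
action = [0.0, 0.3]    # [linear, angular]

# Lines 40-41: laser ranges + goal + action (DRL-navigation order)
s_outer = np.append(np.append(laser, toGoal), action)

# Lines 49-50: laser ranges + action + goal (swapped)
s_inner = np.append(np.append(laser, action), toGoal)

print(s_outer[-4:])  # [1.  0.5 0.  0.3]
print(s_inner[-4:])  # [0.  0.3 1.  0.5]
```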

reiniscimurs commented 11 months ago

Hi,

DRL-navigation is a separate repo from GDAE and is not the exact method used to train a network for this implementation, but a general DRL navigation policy training method. GDAE is a specific repo for a specific implementation on a physical robot, which means there is some engineering in the code that would not be required in a generic solution. To use the DRL-navigation policy here, you would have to update the GDAE code so that it aligns with it. One of these alignments is the state order that you pointed out. Others might be more subtle and depend on your use case. See a bit of the discussion here: https://github.com/reiniscimurs/GDAE/issues/14
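One way to keep the two call sites aligned (a sketch, not code from either repo) would be a single helper that fixes the state layout to whatever order the policy was trained with:

```python
import numpy as np

def make_state(laser, to_goal, action):
    # Hypothetical helper: always assemble the state as
    # laser ranges + goal (dis, theta) + action (linear, angular),
    # i.e. the order used in the DRL-navigation training code.
    return np.append(np.append(laser, to_goal), action)
```

Calling such a helper at both places in GDAM.py would remove the ordering mismatch, as long as the helper itself matches the layout used during training.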

bit-lsj commented 11 months ago

Thank you for your reply. I understand what you mean: GDAE is a repository independent of DRL-navigation. But in your GDAM.py file, as I pointed out above, lines 40-41 read:

```python
s = np.append(s2, toGoal)
s = np.append(s, action)
```

while lines 49-50 read:

```python
s = np.append(s2, a[0])
s = np.append(s, toGoal)
```

The order of the states in these two places seems to differ, which is the main reason for my confusion. Thank you again for your reply!

reiniscimurs commented 11 months ago

Feel free to change this around in GDAE to fit what is in the training.
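For example, if the trained policy expects the "laser + goal + action" layout from lines 40-41, the change to lines 49-50 could look like this (a sketch, assuming that layout):

```python
s = np.append(s2, toGoal)  # goal (dis, theta) first, matching lines 40-41
s = np.append(s, a[0])     # then the action from the actor
```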

bit-lsj commented 11 months ago

Okay, thank you for your reply.