vita-epfl / CrowdNav

[ICRA19] Crowd-aware Robot Navigation with Attention-based Deep Reinforcement Learning
MIT License

train question #31

Open anlanxuan opened 3 years ago

anlanxuan commented 3 years ago

Excuse me, can you tell me how to train the agent with the linear policy? When I modified the training parameters, after imitation learning finished and I ran the command `explorer.run_k_episodes(env.case_size['val'], 'val', episode=episode)`, the lines `action = self.robot.act(ob)` and `ob, reward, done, info = self.env.step(action)` show that the action is None (in `vx = human.vx - action.vx`). Can you give me some advice?

ChanganVR commented 3 years ago

Are you referring to the linear policy that moves in a straight line, and are you training a policy to mimic that forward-only behavior?

It's hard to say what causes this problem, since you might have made some changes and I also haven't looked at the code for a while. What I would suggest is to troubleshoot by running the original code first and then adding back your changes step by step, so that you can locate which change introduced this bug.
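One common cause of `robot.act(ob)` returning `None` is a `predict()` method that falls through without hitting a `return` on some branch. A minimal self-contained sketch of that failure mode (class names here are hypothetical stand-ins, not the actual CrowdNav code):

```python
class LinearPolicy:
    """Hypothetical policy that always moves in a straight line."""
    def predict(self, state):
        vx, vy = 1.0, 0.0
        # If this return were missing (e.g. only reached in an untaken
        # branch after a code change), act() would silently yield None.
        return (vx, vy)

class Robot:
    """Hypothetical robot that delegates action selection to its policy."""
    def __init__(self, policy):
        self.policy = policy

    def act(self, ob):
        return self.policy.predict(ob)

robot = Robot(LinearPolicy())
action = robot.act(ob=None)
# Guard against a predict() that fell through without returning:
assert action is not None, "predict() returned None; check its branches"
```

If the assertion fires in your modified code, inspect every branch of the policy's `predict()` (and any phase/device configuration it depends on) for a missing return.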

anlanxuan commented 3 years ago

Thanks. If I train the agent by imitating a policy (ORCA) and then train the value network with a DRL policy (CADRL or SARL), will the agent have the capability to cope with humans following the linear policy? And is the maximum number of humans 10?

ChanganVR commented 3 years ago

> will the agent have the capability to cope with humans following the linear policy?

I think so. Humans with a linear policy should be a relatively straightforward case to deal with.

> is the maximum number of humans 10?

The maximum number of humans? I remember it can be set to any number you like.
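Assuming the number of humans is read from an INI-style config file with a `[sim]` section (as in CrowdNav's `env.config`; the exact section and key names here are an assumption), changing it is just an edit to that value. A sketch using Python's standard `configparser`:

```python
import configparser

# Hypothetical minimal stand-in for an env.config file.
config = configparser.ConfigParser()
config.read_string("""
[sim]
human_num = 5
""")

# Raise the number of simulated humans; values are stored as strings.
config.set('sim', 'human_num', '10')
print(config.getint('sim', 'human_num'))  # → 10
```

Note that raising the human count also changes the size of the observation the value network sees per step, which is why attention-based aggregation (as in SARL) can handle a variable number of humans.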