mit-acl / gym-collision-avoidance

Train new policy from scratch #21

Open EttoreCaputo opened 4 months ago

EttoreCaputo commented 4 months ago

Hello @mfe7, I am currently trying to train a new RL policy by following the documentation. I am not able to run a training because I don't know what I need to use to execute the following line:

```python
rl_action = model.sample(obs)
```

What should I use in place of `model`? Is there an example file I can follow to achieve this? I would appreciate any help, as I am working on a course project at my university.

mfe7 commented 4 months ago

Hi - you could use your current policy as `model`, and for `sample` use whichever method your policy has to predict an action given an observation (it may be more common to name this method `predict`).

Here's an example of an RL training script. The snippet in the documentation you linked is one way to instantiate the environment, but you could otherwise train the policy using the more standard RL pipeline shown in the Stable Baselines link.
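
For concreteness, here is a rough sketch of that standard pipeline using Stable Baselines 3. The env id `CollisionAvoidance-v0` and the registration-on-import behavior are assumptions based on this repo's docs (check them for the exact setup), and PPO is just one example algorithm, not the repo's method:

```python
# Rough sketch of a standard SB3 training pipeline, not this repo's exact API.
import gym
import gym_collision_avoidance  # assumed to register 'CollisionAvoidance-v0'
from stable_baselines3 import PPO

env = gym.make("CollisionAvoidance-v0")

# Train a policy with an off-the-shelf algorithm (PPO is just one choice).
model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=100_000)

# At inference time, model.predict plays the role of model.sample(obs)
# from the docs snippet (classic single-return Gym API assumed here).
obs = env.reset()
done = False
while not done:
    rl_action, _ = model.predict(obs, deterministic=True)
    obs, reward, done, info = env.step(rl_action)
```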

EttoreCaputo commented 4 months ago

Thank you for your answer @mfe7, but I'm still having trouble. I would like to start by training the CADRLPolicy, then change something and retrain, but I'm not able to perform a training. I tried to use the CADRLPolicy.value_net.train_neural_network() method, but I didn't have the right data, so I tried to run the pedData_processing_multi.py file. However, it needs a file that isn't available in the repo: _4_agents_cadrlraw.p. Where can I find it? Or is there a script that can generate it?

mfe7 commented 4 months ago

The CADRL training scripts haven't been maintained since ~2017, and the CADRL code is just meant to be available for inference. We do have a separate repo for training policies using GA3C-CADRL, https://github.com/mit-acl/rl_collision_avoidance, in case that is helpful.