Stanford-ILIAD / CARLO

2D Driving Simulator
MIT License

'Runner' object has no attribute 'eval' #1

Closed. baifanxxx closed this issue 3 years ago.

baifanxxx commented 4 years ago

Hi, thank you for your interesting work, but I have some problems running your code. When I run "python train_ppo.py", I get the following error:

ile "/home/baifan/anaconda3/envs/pt2/lib/python3.6/site-packages/stable_baselines/common/callbacks.py", line 228, in _on_step return self.callback(self.locals, self.globals) File "train_ppo.py", line 116, in callback rets = evaluate(model, eval_dir) File "train_ppo.py", line 75, in evaluate model.eval() AttributeError: 'Runner' object has no attribute 'eval' In call to configurable 'train' (<function train at 0x7f3bde956bf8>)

I think this problem may require transferring the running averages from env to eval_env, but I do not know how to do that. Could you help me? Thank you.
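Something like the sketch below is what I have in mind, assuming both environments are stable-baselines VecNormalize wrappers (this is only a guess on my side, not a confirmed fix for the error above):

    from stable_baselines.common.vec_env import VecNormalize

    def copy_running_averages(train_env: VecNormalize, eval_env: VecNormalize) -> None:
        # Copy the observation/return normalization statistics from the
        # training environment to the evaluation environment.
        eval_env.obs_rms = train_env.obs_rms
        eval_env.ret_rms = train_env.ret_rms
        eval_env.training = False     # freeze the statistics during evaluation
        eval_env.norm_reward = False  # report unnormalized rewards while evaluating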

Sincerely, BAI Fan

ebiyik commented 4 years ago

It seems the error is thrown by a file called train_ppo.py, which is not in our repository but in stable-baselines. Therefore, I am not sure what the issue is. What is the variable named "model"? Is it an environment or a policy model?

baifanxxx commented 4 years ago

Hi, I ran this code from your 'allan' branch, not the master branch, because the code in the master branch is quite minimal and does not include training and testing code. So, is the 'allan' branch the code for your paper? If not, how can I train and test the code from your paper? Thank you. Sincerely

ebiyik commented 4 years ago

Hi again, The "allan" branch is not related to our publication, but must be a different work. This repo (master branch) contains only the simulation environment we have developed and used in the paper. The full training pipeline is not available online, but it should be easy to replicate it. Simply, you can (i) wrap the scenario in example_intersection.py as an OpenAI Gym environment, (ii) collect some data on it (using either the policy we provided, some human-collected data or any other control policy), (iii) train behavioral cloning or CoIL policies using those data, and finally (iv) train the high-level RL policy using OpenAI baselines or stable-baselines.