eleurent / rl-agents

Implementations of Reinforcement Learning and Planning algorithms
MIT License

AttributeError: 'Evaluation' object has no attribute 'seed', and whether it is feasible to rely solely on reinforcement learning for automatic driving decision-making #98

Closed zh1114haoshen closed 1 year ago

zh1114haoshen commented 1 year ago

Hello Mr. Edouard, first of all, thank you very much for your contribution to automatic driving in the highway scenario; I am very interested in this area. My current work is to deploy reinforcement learning in the highway scenario, using the intersection_social_dqn.ipynb framework from the highway repository. I only changed the environment to the highway environment, and deployed DQN, DDQN, and dueling DDQN without any problems.

[screenshot: Snipaste_2023-04-24_16-17-22]
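For context, my setup follows the notebook roughly like this; a sketch, assuming the load_environment / load_agent / Evaluation API that the rl-agents notebooks use, with placeholder config paths:

```python
# Sketch of my notebook-style setup (config paths are placeholders for my own
# JSON files; the factory/Evaluation API is the one used in the notebooks).
from rl_agents.agents.common.factory import load_agent, load_environment
from rl_agents.trainer.evaluation import Evaluation

env_config = "configs/env.json"      # placeholder: highway environment config
agent_config = "configs/agent.json"  # placeholder: DQN / DDQN / dueling DDQN config

env = load_environment(env_config)
agent = load_agent(agent_config, env)

# Training runs fine with this setup for DQN, DDQN, and dueling DDQN.
evaluation = Evaluation(env, agent, num_episodes=1000, display_env=False)
evaluation.train()
```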

However, when I deployed FTQ in the highway scenario, I ran into a missing attribute: AttributeError: 'Evaluation' object has no attribute 'seed'.

[screenshot: Snipaste_2023-04-24_16-15-37]

I don't know how to solve it, and I look forward to your reply.

In addition, I would like to ask whether you have achieved 100% safety relying purely on reinforcement learning. I found in the highway documentation that you use DQN from stable_baselines3 to make decisions, but there are no related performance indicators such as collision rate and average speed.

So I would like to ask if it is feasible to rely solely on reinforcement learning to achieve automatic driving decision-making.

eleurent commented 1 year ago

Hey! I pushed a fix, you can try again and let me know if this solves your issue.

> In addition, I would like to ask whether you have achieved 100% safety relying purely on reinforcement learning. I found in the highway documentation that you use DQN from stable_baselines3 to make decisions, but there are no related performance indicators such as collision rate and average speed. So I would like to ask if it is feasible to rely solely on reinforcement learning to achieve automatic driving decision-making.

In my own experience, yes, it is feasible to achieve "100% safety" (as evidenced by an empirical zero collisions over some number of episodes, not a proof per se), especially in simplified simulated domains like highway-env. The real world is another beast. And of course, it also depends on a number of factors and design choices: what to use for observations and actions, the reward function (usually a safety vs efficiency tradeoff is involved here), the network architecture, the high-level algorithm, the implementation, etc. I refer to stable_baselines3 in the docs because it is a strictly better library than this one (better maintained, lots of algorithms, wide community), but I mostly experimented with rl-agents. You can look at my PhD publications for more details. Unfortunately, I focused mostly on reward maximisation, which aggregates safety metrics like collisions with efficiency metrics like speed, and I did not always provide a table distinguishing the two, which I agree would have been a great idea.
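For instance, here is a rough sketch of how one could log those two metrics separately with stable_baselines3 and highway-env; it assumes highway-env's default info dict exposes "crashed" and "speed" keys (recent versions do), and the training budget is a placeholder:

```python
# Sketch: evaluate a DQN policy in highway-env and report collision rate and
# average speed separately, instead of a single aggregated return.
import gymnasium as gym
import highway_env  # noqa: F401  (registers highway-v0)
from stable_baselines3 import DQN

env = gym.make("highway-v0")
model = DQN("MlpPolicy", env, verbose=0)
model.learn(total_timesteps=20_000)  # placeholder training budget

n_episodes, collisions, speeds = 100, 0, []
for episode in range(n_episodes):
    obs, info = env.reset(seed=episode)
    terminated = truncated = False
    while not (terminated or truncated):
        action, _ = model.predict(obs, deterministic=True)
        obs, reward, terminated, truncated, info = env.step(action)
        speeds.append(info["speed"])
    collisions += int(info["crashed"])

print(f"collision rate: {collisions / n_episodes:.2%}")
print(f"average speed:  {sum(speeds) / len(speeds):.1f} m/s")
```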

zh1114haoshen commented 1 year ago

Hello, I have updated to the latest rl-agents and found that some dependencies are not supported. After updating and rerunning, it still does not work. First, the environment has no seed attribute.

[screenshots: noseed, myseed]

After adding a seed method, it then reports that there is no seeding. I don't know what to modify next.

[screenshot: noseeding]

eleurent commented 1 year ago

Hi, sorry about that, there was indeed another bug; it should be fixed now (the new API is env.reset(seed=seed) rather than env.seed()). You can remove the seed() you added, upgrade, and try again.
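For reference, a minimal before/after of the gym → gymnasium seeding change:

```python
import gymnasium as gym
import highway_env  # noqa: F401

env = gym.make("highway-v0")

# Old gym API (removed in gymnasium):
# env.seed(42)
# obs = env.reset()

# New gymnasium API: seeding goes through reset()
obs, info = env.reset(seed=42)
```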