Making Tasks Infinite-Horizon

Farama-Foundation / HighwayEnv

A minimalist environment for decision-making in autonomous driving

https://highway-env.farama.org/

MIT License

Making Tasks Infinite-Horizon #237

Closed nileshop22 closed 2 years ago

nileshop22 commented 3 years ago

Hi, my question is: how can we make the environments infinite-horizon?

Reply from @eleurent:

I guess there are two ways to make an environment infinite-horizon:

make the road network closed, like the racing track environment

make vehicles spawn in some vicinity of the controlled vehicle, and disappear when they are far, like in the intersection environment (although it is finite horizon)
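
The second option can be sketched independently of highway-env. This is a minimal toy sketch, not the library's implementation: positions are 1-D longitudinal coordinates, and `recycle_traffic`, `min_gap`, `spawn_radius`, and `despawn_radius` are hypothetical names chosen for illustration.

```python
import random


def recycle_traffic(ego_x, others, n_target, min_gap=10.0,
                    spawn_radius=100.0, despawn_radius=150.0, rng=None):
    """Keep n_target vehicles around the ego vehicle (hypothetical helper).

    All positions are 1-D longitudinal coordinates in metres.
    """
    rng = rng or random.Random()
    # Drop vehicles that have drifted too far from the ego vehicle.
    kept = [x for x in others if abs(x - ego_x) <= despawn_radius]
    # Spawn replacements near the ego vehicle, but not on top of it.
    while len(kept) < n_target:
        offset = rng.uniform(min_gap, spawn_radius) * rng.choice((-1.0, 1.0))
        kept.append(ego_x + offset)
    return kept


# The vehicle at x=0 is 1000 m behind the ego vehicle, so it is replaced.
traffic = recycle_traffic(ego_x=1000.0, others=[0.0, 950.0, 1080.0],
                          n_target=5, rng=random.Random(0))
```

Calling such a helper at every step would keep the scene populated without ever resetting, although collisions would still need separate handling.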

Further, in the Highway env, I removed the duration constraint here, so the environment resets only after a collision. Although this makes the task infinite-horizon, I am not sure it is a good thing to try, since I still have to reset the environment after a collision.

Racetrack is also a good option to try, but here again the environment has to be reset after a collision.

I have a few other questions:

Are there any other ways to make these tasks infinite-horizon, or any ideas for building completely new infinite-horizon tasks?
How can I implement a discrete action space for Racetrack?

Thanks! Looking forward to your reply @eleurent.

eleurent commented 3 years ago

Are there any other ways to make these tasks infinite-horizon, or any ideas for building completely new infinite-horizon tasks?

If you want to forbid any kind of reset, including resets after a collision with another vehicle, then:

either the controlled vehicle has to be the only vehicle in the scene, with no other obstacles. This limits learning tasks to lane following, maneuvering, reaching a desired location, etc.
or there are other vehicles, but some kind of safe policy monitors the actions of the controlled vehicle and blocks them when they would lead to a collision. However, no such policy exists for now, and it is not trivial to implement. It may be achieved with model predictive control (i.e. simulate the desired action and, if there is a collision, rewind and choose a safer action instead), but that may be computationally costly.
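
The "simulate, rewind, pick a safer action" idea can be illustrated on a toy problem. This is only a sketch under stated assumptions: `ToyEnv` is a hypothetical 1-D stand-in rather than a highway-env environment, `shielded_action` is an illustrative name, and whether a real environment can be deep-copied cheaply enough for this is something to verify.

```python
import copy


class ToyEnv:
    """Hypothetical 1-D stand-in for a driving environment (not highway-env)."""

    ACTIONS = {"SLOWER": -5.0, "IDLE": 0.0, "FASTER": 5.0}  # speed change (m/s)

    def __init__(self, gap=30.0, speed=10.0):
        self.gap = gap        # distance to a stopped obstacle ahead (m)
        self.speed = speed    # ego speed (m/s)
        self.crashed = False

    def step(self, action, dt=1.0):
        self.speed = max(0.0, self.speed + self.ACTIONS[action])
        self.gap -= self.speed * dt
        self.crashed = self.crashed or self.gap <= 0.0
        return self.crashed


def shielded_action(env, desired, fallbacks=("IDLE", "SLOWER"), horizon=3):
    """Return `desired` if a simulated rollout stays collision-free,
    otherwise the first fallback action that does."""
    for action in (desired, *fallbacks):
        sim = copy.deepcopy(env)  # simulate on a copy so the real env is untouched
        crashed = sim.step(action)
        # Assume the vehicle brakes for the rest of the lookahead window.
        for _ in range(horizon - 1):
            crashed = crashed or sim.step("SLOWER")
        if not crashed:
            return action
    return fallbacks[-1]  # nothing is safe: use the most cautious action


env = ToyEnv(gap=30.0, speed=10.0)
chosen = shielded_action(env, desired="FASTER")  # accelerating would crash here
```

Each decision costs one rollout per candidate action, which is where the computational overhead mentioned above comes from.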

I'm not sure what other options you are thinking of...

How can I implement a discrete action space for Racetrack?

See the docs: https://highway-env.readthedocs.io/en/latest/actions/index.html

nileshop22 commented 3 years ago

Thanks! I wanted to know how to implement discrete meta-actions for Racetrack; to me it seems to use continuous actions.

eleurent commented 3 years ago

You have to do the following:

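import gym
import highway_env  # importing the package registers its environments with gym

env = gym.make("racetrack-v0")
# Switch the action space to discrete meta-actions
env.configure({
    "action": {
        "type": "DiscreteMetaAction"
    }
})
env.reset()  # the new configuration is applied on reset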

nileshop22 commented 3 years ago

@eleurent Just a quick question: are all the environments here stochastic? I suspect so, though I haven't gone through the code myself. Could you please confirm? Thanks.

eleurent commented 3 years ago

No, most environments are deterministic, at least by default. The initial state, however, is typically sampled from some distribution, which brings some randomness to the agent's experiences.

eleurent commented 3 years ago

Actually, in some environments the behavioural parameters of vehicles in the scene are also randomly sampled at initialization. Since these are not observed directly but affect the transitions of these vehicles, this can be seen as a form of stochasticity (although better described as partial observability).

In most cases however, these behavioural parameters are simply initialized to a default value for every vehicle, and then everything is properly deterministic.

nileshop22 commented 3 years ago

Hi, thanks a lot for the detailed and quick response! I'm thinking of adding some form of stochasticity, either by adding some friction or by other means (maybe some real-world stochasticity 🤔). What are your views on this? Also, how should I proceed to implement it? Thanks again.

eleurent commented 3 years ago

For friction:

the simplest is to add a linear (or quadratic) fluid friction force F = -\alpha v to the dynamics
alternatively, you could switch the vehicle model to the DynamicsModel, a finer dynamical model that simulates tire friction and slip. Note, however, that this requires increasing the simulation frequency, otherwise the simulation becomes unstable.

Other sources of stochasticity:

for other vehicles: their destination, their driving styles
for every vehicle: you may directly add random noise (Gaussian, or e.g. Ornstein-Uhlenbeck) to the dynamics, to model some unknown disturbances
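
As an illustration of the noise option, here is a minimal sketch of an Ornstein-Uhlenbeck disturbance added to a speed update. It is standalone and does not touch highway-env: the class name, `theta`, `sigma`, and `noisy_speed_update` are assumptions made for the example, and the standard-library `random` module stands in for whatever random generator the environment uses.

```python
import math
import random


class OrnsteinUhlenbeckNoise:
    """Temporally correlated noise, discretised with an Euler-Maruyama step:
    x += -theta * x * dt + sigma * sqrt(dt) * N(0, 1)."""

    def __init__(self, theta=0.5, sigma=0.3, seed=None):
        self.theta = theta    # mean-reversion rate (hypothetical default)
        self.sigma = sigma    # noise scale (hypothetical default)
        self.rng = random.Random(seed)
        self.x = 0.0

    def sample(self, dt):
        self.x += (-self.theta * self.x * dt
                   + self.sigma * math.sqrt(dt) * self.rng.gauss(0.0, 1.0))
        return self.x


def noisy_speed_update(speed, acceleration, noise, dt):
    """Euler step of the speed with an additive acceleration disturbance."""
    return speed + (acceleration + noise.sample(dt)) * dt


noise = OrnsteinUhlenbeckNoise(seed=0)
speed = 20.0
for _ in range(15):  # one second at 15 Hz
    speed = noisy_speed_update(speed, acceleration=0.0, noise=noise, dt=1 / 15)
```

Unlike independent Gaussian noise, successive samples are correlated, so the disturbance drifts smoothly instead of jittering at every step.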

eleurent commented 3 years ago

For instance, you could modify this line from the vehicle kinematics:

https://github.com/eleurent/highway-env/blob/764902f2d7675d2616f062b4c05a8f6618c3d03e/highway_env/vehicle/kinematics.py#L128

and add a random disturbance term, such as a linear friction term with a random coefficient:

        alpha = self.road.np_random.uniform(low=0.2, high=0.3)
        self.speed += self.action['acceleration'] * dt - alpha * self.speed * dt
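
To see what this update does, note that with a constant acceleration a and a fixed coefficient alpha the speed settles at a / alpha, so with alpha drawn from [0.2, 0.3] it ends up hovering between a / 0.3 and a / 0.2. The standalone sketch below reproduces the same Euler update outside the library; `simulate_speed` and its parameters are hypothetical, and Python's `random` module replaces `self.road.np_random`.

```python
import random


def simulate_speed(acceleration, n_steps, dt=1 / 15, seed=0,
                   alpha_low=0.2, alpha_high=0.3):
    """Euler-integrate dv/dt = a - alpha * v, redrawing alpha at every step."""
    rng = random.Random(seed)
    speed = 0.0
    for _ in range(n_steps):
        alpha = rng.uniform(alpha_low, alpha_high)  # random friction coefficient
        speed += acceleration * dt - alpha * speed * dt
    return speed


# 900 steps at 15 Hz is one minute of simulated time.
final_speed = simulate_speed(acceleration=1.5, n_steps=900)
```

With a = 1.5 the final speed should lie between 5 and 7.5 m/s, which gives a quick sanity check on the chosen range for alpha.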