eleurent / rl-agents

Implementations of Reinforcement Learning and Planning algorithms
MIT License

Hi ! #34

Closed CFXgoon closed 4 years ago

CFXgoon commented 4 years ago

Hello! I have a problem with ego_attention.json. When I use ego_attention.json to train the agent in env_obs_attention, the following error happens:

[ERROR] Preferred device cuda:best unavailable, switching to default cpu
INFO: Creating monitor directory out/HighwayEnv/DQNAgent/run_20200408-221242_4475
Traceback (most recent call last):
  File "experiments.py", line 148, in <module>
    main()
  File "experiments.py", line 43, in main
    evaluate(opts['<environment>'], opts['<agent>'], opts)
  File "experiments.py", line 75, in evaluate
    display_rewards=not options['--no-display'])
  File "/home/cfxgg/rl-agents-master/rl_agents/trainer/evaluation.py", line 82, in __init__
    self.agent.set_writer(self.writer)
  File "/home/cfxgg/rl-agents-master/rl_agents/agents/deep_q_network/pytorch.py", line 98, in set_writer
    dtype=torch.float, device=self.device))
  File "/home/cfxgg/conda/envs/test/lib/python3.7/site-packages/tensorboardX/writer.py", line 804, in add_graph
    self._get_file_writer().add_graph(graph(model, input_to_model, verbose, profile_with_cuda, **kwargs))
  File "/home/cfxgg/conda/envs/test/lib/python3.7/site-packages/tensorboardX/pytorch_graph.py", line 344, in graph
    result = model(*args)
  File "/home/cfxgg/conda/envs/test/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/cfxgg/rl-agents-master/rl_agents/agents/common/models.py", line 284, in forward
    ego_embedded_att, _ = self.forward_attention(x)
  File "/home/cfxgg/rl-agents-master/rl_agents/agents/common/models.py", line 296, in forward_attention
    ego, others, mask = self.split_input(x)
  File "/home/cfxgg/rl-agents-master/rl_agents/agents/common/models.py", line 289, in split_input
    ego = x[:, 0:1, :]
IndexError: too many indices for tensor of dimension 2

How can I solve this problem? I look forward to your reply!

eleurent commented 4 years ago

I could not reproduce the issue using python3 experiments.py evaluate configs/HighwayEnv/env_obs_attention.json configs/HighwayEnv/agents/DQNAgent/ego_attention.json --train --episodes=1500

The error message states that the input tensor x (the observation) was of dimension 2 (rather than 3). Here, the input tensor is a fake observation used for dumping the computation graph of the neural network to tensorboard, generated here: https://github.com/eleurent/rl-agents/blob/master/rl_agents/agents/deep_q_network/pytorch.py#L97

The env.observation_space is supposed to have dimension two (vehicles x features), and a supplementary (batch) dimension is added at L97, which should make it three. Can you maybe add a print statement of self.env.observation_space.shape in the set_writer() function?
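For reference, here is a minimal sketch of what that dummy input looks like (the environment id highway-v0 is just an example; the actual tensor is built in pytorch.py as linked above):

import gym
import highway_env  # noqa: F401  (registers the highway environments)
import torch

env = gym.make("highway-v0")
print(env.observation_space.shape)   # expected to be 2-D, e.g. (vehicles, features)
dummy_obs = torch.zeros((1, *env.observation_space.shape), dtype=torch.float)
print(dummy_obs.shape)               # with the batch dimension prepended, this should be 3-D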

CFXgoon commented 4 years ago

I could not reproduce the issue using python3 experiments.py evaluate configs/HighwayEnv/env_obs_attention.json configs/HighwayEnv/agents/DQNAgent/ego_attention.json --train --episodes=1500

The error message states that the input tensor x (the observation) was of dimension 2 (rather than 3). Here, the input tensor is a fake observation used for dumping the computation graph of the neural network to tensorboard, generated here: https://github.com/eleurent/rl-agents/blob/master/rl_agents/agents/deep_q_network/pytorch.py#L97

The env.observation_space is supposed to have dimension two (vehicles x features), and a supplementary (batch) dimension is added at L97, which should make it three. Can you maybe add a print statement of self.env.observation_space.shape in the set_writer() function?

Hi, I added a print statement of self.env.observation_space.shape in the set_writer() function; the shape is (15, 7). What should I do next? Thanks!!!

eleurent commented 4 years ago

Okay the observation shape is fine... Can you please add a print statement for the variable x just before the exception?

CFXgoon commented 4 years ago

Okay the observation shape is fine... Can you please add a print statement for the variable x just before the exception?

Hi, thanks!! After I print x in models.py line 289, the output is three tensors of shape (1, 15, 7), all zeros:

tensor([[[0., 0., 0., 0., 0., 0., 0.],
         ...
         [0., 0., 0., 0., 0., 0., 0.]]])

followed by a fourth tensor of shape (15, 7), also all zeros:

tensor([[0., 0., 0., 0., 0., 0., 0.],
        ...
        [0., 0., 0., 0., 0., 0., 0.]])

What should I do? Thanks a lot!!

eleurent commented 4 years ago

Okay so the fourth tensor is wrong (dimension 2 instead of 3), which should not happen.

When the exception is raised, can you check the stack trace to see where this observation x was created? If it is in the set_writer function of DQNAgent, can you put the input_to_model parameter value (torch.zeros((1, *self.env.observation_space.shape), dtype=torch.float, device=self.device)) into a local variable and print it?

I'm sorry that this is a bit tedious, as I said I cannot reproduce the bug so it's a bit hard to investigate.
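Concretely, something like this inside set_writer() (a sketch only; 'model' stands for whichever network is passed to add_graph in pytorch.py):

# Put the dummy observation into a local variable so it can be inspected before the crash
input_to_model = torch.zeros((1, *self.env.observation_space.shape),
                             dtype=torch.float, device=self.device)
print(input_to_model, input_to_model.shape)   # should print a 3-D tensor of zeros
self.writer.add_graph(model, input_to_model=input_to_model)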

CFXgoon commented 4 years ago

Okay so the fourth tensor is wrong (dimension 2 instead of 3), which should not happen.

When the exception is raised, can you check the stack trace to see where this observation x was created? If it is in the set_writer function of DQNAgent, can you put the input_to_model parameter value (torch.zeros((1, *self.env.observation_space.shape), dtype=torch.float, device=self.device)) into a local variable and print it?

I'm sorry that this is a bit tedious, as I said I cannot reproduce the bug so it's a bit hard to investigate.

Hi, when I print torch.zeros((1, *self.env.observation_space.shape), dtype=torch.float, device=self.device) in the set_writer function, it is a (1, 7, 5) tensor of zeros:

tensor([[[0., 0., 0., 0., 0.],
         ...
         [0., 0., 0., 0., 0.]]])

Is this right? I have another question: if I want to train with a plain (single) DQN, which config file should I use? I see that baseline corresponds to the dueling double DQN (3DQN) and no_dueling corresponds to the DDQN. I'm very sorry to disturb you again... thanks!!

eleurent commented 4 years ago

Yes, this tensor is right, it is not causing the bug. In your previous post several tensors were printed; I'm interested in seeing where the last one (of dimension 2) is generated.

I have another question: if I want to train with a plain (single) DQN, which config file should I use? I see that baseline corresponds to the dueling double DQN (3DQN) and no_dueling corresponds to the DDQN. I'm very sorry to disturb you again... thanks!!

Don't apologize! Right now the Double DQN is always activated and not configurable. I'll add a config.

CFXgoon commented 4 years ago

of dimension 2

Hi, thanks a lot! I changed the KinematicObservation class in observation.py, raising vehicles_count from 5 to 7; maybe this change caused the bug? Thanks a lot for adding a single-DQN config!!!!

eleurent commented 4 years ago

Hi, thanks a lot! I changed the KinematicObservation class in observation.py, raising vehicles_count from 5 to 7; maybe this change caused the bug?

No, it should not cause a bug. Can you print every observation that you get before the exception is raised? One of them is probably ill-formed.

I added separate configs for dqn, ddqn and dueling ddqn in eb265278982d3c0da876e1ea3c75c7476d4ee2fc

CFXgoon commented 4 years ago

Hi, thanks a lot! I changed the KinematicObservation class in observation.py, raising vehicles_count from 5 to 7; maybe this change caused the bug?

No, it should not cause a bug. Can you print every observation that you get before the exception is raised? One of them is probably ill-formed.

I added separate configs for dqn, ddqn and dueling ddqn in eb26527

Hi, first, thanks a lot! You mean that if I use no_dueling.json to train and set "double" = False, it is a plain (single) DQN; if "double" = True, it is the DDQN; and the file named baseline is the dueling double DQN (3DQN). Is this right?

Second, when I print df in the KinematicObservation class in observation.py, the terminal shows:

   presence          x     y        vx   vy
0         1  15.172513  0.00  0.416667  0.0
1         1  15.286101  0.50  0.402182  0.0
2         1  15.407803  0.75  0.398097  0.0
3         1  15.526467  0.50  0.400257  0.0
4         1  15.656908  0.50  0.411373  0.0
5         1  15.777432  0.50  0.411182  0.0
6         1  15.904327  0.75  0.396529  0.0

and

   presence          x     y        vx   vy  cos_h  sin_h
0         1  15.669557  0.00  0.416667  0.0    1.0    0.0
1         1   0.115755  0.75 -0.023241  0.0    1.0    0.0
2         1   0.250318  0.75 -0.019270  0.0    1.0    0.0
3         1   0.379993  0.00 -0.012500  0.0    1.0    0.0
4         1   0.495839  0.00 -0.017570  0.0    1.0    0.0
5         1   0.611608  0.00 -0.019833  0.0    1.0    0.0
6         1   0.742235  0.00 -0.001224  0.0    1.0    0.0
7         1   0.863308  0.50 -0.030648  0.0    1.0    0.0
8         1   0.984648  0.75 -0.003740  0.0    1.0    0.0
9         1   1.123828  0.75 -0.005147  0.0    1.0    0.0

And then the program ends. Is this right? Thank you very much for your help!!!

eleurent commented 4 years ago

This is fine, the observations seem alright. Can you also add the print of x inside split_input again (before the crash), and show me the entire stdout output including the stack trace? (not only the print statement)

CFXgoon commented 4 years ago

This is fine, the observations seem alright. Can you also add the print of x inside split_input again (before the crash), and show me the entire stdout output including the stack trace? (not only the print statement)

Hi, I put a logging.exception statement in split_input, but the output in the terminal is the same as before... I will try again and report back to you! Another question: if I want to test a model that I trained before, which command should I run? I used the command: python3 experiments.py evaluate configs/HighwayEnv/env_medium.json out/HighwayEnv/DQNAgent/saved_models/latest.tar --test --episodes=10 but it failed...

eleurent commented 4 years ago

Another question: if I want to test a model that I trained before, which command should I run?

You should simply use

python3 experiments.py evaluate configs/HighwayEnv/env_medium.json \
                                configs/HighwayEnv/agents/path/to/your/agent_config.json \
                                --test --episodes=10

By default, it will look for a saved model at out/HighwayEnv/<agent class>/saved_models/latest.tar

If you want to provide another model than the default (latest save), you can do so by adding the option --recover-from=out/HighwayEnv/path/to/your/saved_model.tar
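For example, putting it all together (the config and checkpoint paths below are placeholders for your own files):

python3 experiments.py evaluate configs/HighwayEnv/env_medium.json \
                                configs/HighwayEnv/agents/path/to/your/agent_config.json \
                                --test --episodes=10 \
                                --recover-from=out/HighwayEnv/path/to/your/saved_model.tar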

CFXgoon commented 4 years ago

You should simply use

python3 experiments.py evaluate configs/HighwayEnv/env_medium.json \
                                configs/HighwayEnv/agents/path/to/your/agent_config.json \
                                --test --episodes=10

By default, it will look for a saved model at out/HighwayEnv/<agent class>/saved_models/latest.tar

If you want to provide another model than the default (latest save), you can do so by adding the option --recover-from=out/HighwayEnv/path/to/your/saved_model.tar

Hi, I have a question: should the configs/HighwayEnv/agents/path/to/your/agent_config.json match the model that I trained before? For example, if I trained with DQN before and now want to test that model, can I use ddqn.json for testing?

eleurent commented 4 years ago

Yes, the agent config should match the model which you trained. This is because the agent is first created, with its model architecture, from config. And then only the model weights are loaded from the save file.

If you don't remember what config you used, it is always saved in the run directory in a metadata.json file (in the agent field) for reproducibility.
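For instance, a small sketch for recovering that config from a run directory (the run directory name below is just an example; the layout is assumed to be a metadata.json containing an "agent" field, as described above):

import json

# Metadata saved alongside a training run (example path)
with open("out/HighwayEnv/DQNAgent/run_20200408-221242_4475/metadata.json") as f:
    metadata = json.load(f)

# Write the agent part back out as a standalone config usable with experiments.py
with open("recovered_agent_config.json", "w") as f:
    json.dump(metadata["agent"], f, indent=2)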

CFXgoon commented 4 years ago

Yes, the agent config should match the model which you trained. This is because the agent is first created, with its model architecture, from config. And then only the model weights are loaded from the save file.

If you don't remember what config you used, it is always saved in the run directory in a metadata.json file (in the agent field) for reproducibility.

Thanks! For testing, if I want to use an environment different from the one used in training, can I just change the environment config, for example from env_medium.json to env.json? Thanks!!!

eleurent commented 4 years ago

Sure, you can change the environment, provided that the observation and action spaces still match of course.

In particular, be careful with the observation normalization: I think by default I normalize the y position according to the number of lanes (0 is min lane, 1 is max lane). So if you train with 3 lanes and test with 4 lanes, the normalization will change which won't be suitable for the trained policy. Thus, if you want to do so, you should provide the same fixed value for the "feature_range" of the observation in both environment configs.
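For example, both environment configs could pin the observation ranges to the same fixed values, along these lines (a sketch only; the exact key names, e.g. "features_range", and the values may differ depending on your highway-env version):

{
  "observation": {
    "type": "Kinematics",
    "features": ["presence", "x", "y", "vx", "vy"],
    "features_range": {"x": [-100, 100], "y": [-12, 12], "vx": [-20, 20], "vy": [-20, 20]}
  }
}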

CFXgoon commented 4 years ago

Sure, you can change the environment, provided that the observation and action spaces still match of course.

In particular, be careful with the observation normalization: I think by default I normalize the y position according to the number of lanes (0 is min lane, 1 is max lane). So if you train with 3 lanes and test with 4 lanes, the normalization will change which won't be suitable for the trained policy. Thus, if you want to do so, you should provide the same fixed value for the "feature_range" of the observation in both environment configs.

Thanks!! So if I change the number of vehicles in the environment, the vehicle speeds, and so on, but do not change the number of lanes, the observation normalization will not change. Is that right?

eleurent commented 4 years ago

Yep, in that case you should be fine! :)

CFXgoon commented 4 years ago

Yep, in that case you should be fine! :)

Thanks a lot!!! When I look in the saved_models folder, there is only one file named latest.tar; will this file be replaced if I train a new model?

eleurent commented 4 years ago

Yes it will. However, if you erase a previously trained model by mistake, you can always find a copy in the corresponding run directory.

For instance you can do

experiments env.json agent_config_1.json --train
experiments env.json agent_config_2.json --train

and then

experiments env.json agent_config_1.json --test --recover-from=out/agent/run_config_1/checkpoint-final.tar
experiments env.json agent_config_2.json --test --recover-from=out/agent/run_config_2/checkpoint-final.tar

and you can also use checkpoint-best.tar (model with best average performance on a moving window) instead of checkpoint-final.tar (model at end of training).

CFXgoon commented 4 years ago

Yes it will. However, if you erase a previously trained model by mistake, you can always find a copy in the corresponding run directory.

For instance you can do

experiments env.json agent_config_1.json --train
experiments env.json agent_config_2.json --train

and then

experiments env.json agent_config_1.json --test --recover-from=out/agent/run_config_1/checkpoint-final.tar
experiments env.json agent_config_2.json --test --recover-from=out/agent/run_config_2/checkpoint-final.tar

and you can also use checkpoint-best.tar (model with best average performance on a moving window) instead of checkpoint-final.tar (model at end of training).

Thanks a lot ! ! !

CFXgoon commented 4 years ago

Yes it will. However, if you erase a previously trained model by mistake, you can always find a copy in the corresponding run directory.

For instance you can do

experiments env.json agent_config_1.json --train
experiments env.json agent_config_2.json --train

and then

experiments env.json agent_config_1.json --test --recover-from=out/agent/run_config_1/checkpoint-final.tar
experiments env.json agent_config_2.json --test --recover-from=out/agent/run_config_2/checkpoint-final.tar

and you can also use checkpoint-best.tar (model with best average performance on a moving window) instead of checkpoint-final.tar (model at end of training).

Hi, I see that in dqn.json the "layers" setting is [256,256]. So the neural network has an input layer with 25 neurons (if the observation contains 5 vehicles with 5 features each), two hidden layers with 256 neurons each, and an output layer with 5 neurons (matching the action space: keep lane, change left, change right, speed up, slow down). Is this right?

eleurent commented 4 years ago

Yes, you only get to provide the sizes of the hidden layers; the input and output layers take the shape of the observation and the number of actions, respectively. So a config of [256, 256] corresponds to [n_obs, 256, 256, n_actions].
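A minimal sketch of the resulting fully-connected architecture (plain PyTorch, not the exact rl-agents model code):

import torch.nn as nn

def make_mlp(n_obs, hidden_layers, n_actions):
    # Build [n_obs, *hidden_layers, n_actions] with ReLU activations between layers
    sizes = [n_obs, *hidden_layers, n_actions]
    layers = []
    for i in range(len(sizes) - 1):
        layers.append(nn.Linear(sizes[i], sizes[i + 1]))
        if i < len(sizes) - 2:  # no activation after the output layer
            layers.append(nn.ReLU())
    return nn.Sequential(*layers)

# e.g. 5 vehicles x 5 features = 25 inputs, "layers": [256, 256], 5 actions
q_network = make_mlp(25, [256, 256], 5)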

CFXgoon commented 4 years ago

Yes, you only get to provide the sizes of the hidden layers; the input and output layers take the shape of the observation and the number of actions, respectively. So a config of [256, 256] corresponds to [n_obs, 256, 256, n_actions].

Ok, thanks a lot ! ! !

CFXgoon commented 4 years ago

Yes, you only get to provide the sizes of the hidden layers; the input and output layers take the shape of the observation and the number of actions, respectively. So a config of [256, 256] corresponds to [n_obs, 256, 256, n_actions].

Hi, I see that in every metadataXXX.json there is a "show_trajectories" config; how can I change it? Another question: in observation.py under highway_env/envs/common, in the KinematicObservation class, whether I set "absolute" to True or False, the "x" observation of the ego vehicle is around two thousand in both cases. What does this number mean? Is it the absolute longitudinal position of the ego vehicle? Also, is the ego vehicle controlled by IDM and MOBIL for longitudinal and lateral control? And does the lane-change trajectory of the ego vehicle stay the same in every lane-change maneuver? Thanks!!!

eleurent commented 4 years ago

You can change the show_trajectories config directly in the environment config.

x is not a longitudinal position, but a position in the West-East direction (which is not longitudinal for vertical roads). When absolute is False, the origin is the ego-vehicle position. The scale of x and y should be chosen explicitly using feature_range.

The ego-vehicle is not controlled by IDM nor MOBIL, but by the actions provided (change lane, faster, slower). I did not understand the last question.
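As a rough numerical illustration (informal, not the library code; the feature_range values and the mapping to [-1, 1] are assumptions):

# Absolute coordinates: raw world positions along the West-East axis
ego_x, other_x = 2000.0, 2025.0
# Relative coordinates (absolute=False): the ego-vehicle is the origin
relative_other_x = other_x - ego_x                           # 25.0
# Normalization with an explicit feature_range, e.g. [-100, 100] mapped to [-1, 1]
lo, hi = -100.0, 100.0
normalized_x = 2 * (relative_other_x - lo) / (hi - lo) - 1   # 0.25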

CFXgoon commented 4 years ago

You can change the show_trajectories config directly in the environment config.

x is not a longitudinal position, but a position in the West-East direction (which is not longitudinal for vertical roads). When absolute is False, the origin is the ego-vehicle position. The scale of x and y should be chosen explicitly using feature_range.

The ego-vehicle is not controlled by IDM nor MOBIL, but by the actions provided (change lane, faster, slower). I did not understand the last question.

Hi, my question "does the lane-change trajectory of the ego vehicle stay the same in every lane-change maneuver?" means the following: if the ego vehicle takes an action such as changing to the right lane to avoid a collision with an environment vehicle, and later takes an action such as changing to the left lane to avoid another collision, are the two lane-change trajectories the same (for instance, in smoothness)? And is there a difference between the lane-change trajectories when using different algorithms (such as DQN and DDQN)? Thanks!!

eleurent commented 4 years ago

Both trajectories (left and right lane changes) are identical, implemented with the same state-feedback controller. The environment dynamics are independent of the agent interacting with it.

CFXgoon commented 4 years ago

Both trajectories (left and right lane changes) are identical, implemented with the same state-feedback controller. The environment dynamics are independent of the agent interacting with it.

Thanks! I have a question: in the dueling double DQN (3DQN) algorithm, there are a value function and an advantage function; are the value function and the advantage function generated using the same network parameters? Thanks!!

eleurent commented 4 years ago

Yes, you can check out the implementation here: https://github.com/eleurent/rl-agents/blob/master/rl_agents/agents/common/models.py#L78

Right now the advantage and value heads are just linear projections, it might be better to add at least one hidden layer

CFXgoon commented 4 years ago


Hi, I'm very sorry, but I don't know where I can add a hidden layer...

CFXgoon commented 4 years ago

Yes, you can check out the implementation here: https://github.com/eleurent/rl-agents/blob/master/rl_agents/agents/common/models.py#L78

Right now the advantage and value heads are just linear projections, it might be better to add at least one hidden layer

Hi, if I want to add a hidden layer, should I add some code in models.py, in the forward function of the DuelingNetwork class? Is this right?

eleurent commented 4 years ago

Absolutely. You need to replace self.advantage and self.value by MultiLayerPerceptrons. You can do so in the same way as self.base_module, using a config and model_factory.
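A rough sketch of what that could look like (standalone PyTorch, illustrative only; the actual rl-agents DuelingNetwork builds these modules through model_factory and a config rather than hard-coding them):

import torch.nn as nn

class DuelingNetworkSketch(nn.Module):
    def __init__(self, n_obs, n_actions, base_sizes=(256, 256), head_hidden=64):
        super().__init__()
        # Shared feature extractor (plays the role of self.base_module)
        self.base_module = nn.Sequential(
            nn.Linear(n_obs, base_sizes[0]), nn.ReLU(),
            nn.Linear(base_sizes[0], base_sizes[1]), nn.ReLU(),
        )
        # Value and advantage heads as small MLPs instead of single linear projections
        self.value = nn.Sequential(
            nn.Linear(base_sizes[1], head_hidden), nn.ReLU(),
            nn.Linear(head_hidden, 1),
        )
        self.advantage = nn.Sequential(
            nn.Linear(base_sizes[1], head_hidden), nn.ReLU(),
            nn.Linear(head_hidden, n_actions),
        )

    def forward(self, x):
        features = self.base_module(x)
        value = self.value(features)
        advantage = self.advantage(features)
        # Standard dueling aggregation: Q = V + A - mean(A)
        return value + advantage - advantage.mean(dim=1, keepdim=True)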

CFXgoon commented 4 years ago

Absolutely. You need to replace self.advantage and self.value by MultiLayerPerceptrons. You can do so in the same way as self.base_module, using a config and model_factory.

OK, thanks a lot! Best wishes for your future!

eleurent commented 4 years ago

Thanks, you too! If you struggle adding the hidden layer do not hesitate to ask me and I'll implement it.

CFXgoon commented 4 years ago

Thanks, you too! If you struggle adding the hidden layer do not hesitate to ask me and I'll implement it.

Thanks! I will try it and report back with any problems I meet. Thanks a lot!

CFXgoon commented 4 years ago

Thanks, you too! If you struggle adding the hidden layer do not hesitate to ask me and I'll implement it.

Hi, I met an error when I added a hidden layer; can you give me some help? Thanks!!!

eleurent commented 4 years ago

Which error?

CFXgoon commented 4 years ago

Which error?

That's about the number of neurons: the sizes of consecutive layers do not match... so I wonder whether I put the hidden layer in the wrong place...

CFXgoon commented 4 years ago

Which error?

Hi, I have a question about the action space in vehicle/control.py: if the agent chooses "faster", what acceleration does it apply? I see there are two classes related to actions, ControlledVehicle and MDPVehicle. Also, what is the initial speed of the agent at the start of each episode? Thanks!!!

eleurent commented 4 years ago

I added the support for hidden layers in advantage and value heads of Dueling architecture.

Regarding actions:

CFXgoon commented 4 years ago

I added the support for hidden layers in advantage and value heads of Dueling architecture.

Regarding actions:

Hi, I calculated the average speed of the agent over one episode, and it is a floating-point number such as 25.25; is this a system error? In my opinion, if the acceleration equals 5 m/s², the average speed of the agent should be an integer. Is this right? Thanks, I wish you success in your study!

eleurent commented 4 years ago

I added the support for hidden layers in advantage and value heads of Dueling architecture. Regarding actions:

Hi, I calculated the average speed of the agent over one episode, and it is a floating-point number such as 25.25; is this a system error? In my opinion, if the acceleration equals 5 m/s², the average speed of the agent should be an integer. Is this right? Thanks, I wish you success in your study!

No, an average speed of 25.25 m/s can be obtained by e.g. 19 time steps at 25 m/s and 1 time step at 30 m/s: (19 × 25 + 30) / 20 = 25.25 m/s. And note that only the target velocity is a discrete integer; the true vehicle velocity is continuous.

CFXgoon commented 4 years ago

No, an average speed of 25.25 m/s can be obtained by e.g. 19 time steps at 25 m/s and 1 time step at 30 m/s: (19 × 25 + 30) / 20 = 25.25 m/s. And note that only the target velocity is a discrete integer; the true vehicle velocity is continuous.

Hi, does the target velocity mean the agent's velocity, is this right? And I think I don't understand the meaning of "the true vehicle"... By the way, is the velocity of the environment vehicles continuous? Thanks!!!

eleurent commented 4 years ago

does the target velocity mean the agent's velocity, is this right?

No, the target velocity is the desired velocity. It can have discontinuities when the agent suddenly decides to drive at e.g. 30 m/s instead of 25 m/s. In contrast, the true velocity of the simulated vehicle is always continuous.

I think I don't understand the meaning of "the true vehicle"...

I meant the true vehicle velocity, i.e. the actual velocity (as opposed to the target velocity).

CFXgoon commented 4 years ago

does the target velocity mean the agent's velocity, is this right?

No, the target velocity is the desired velocity. It can have discontinuities when the agent suddenly decides to drive at e.g. 30 m/s instead of 25 m/s. In contrast, the true velocity of the simulated vehicle is always continuous.

I think I don't understand the meaning of "the true vehicle"...

I meant the true vehicle velocity, i.e. the actual velocity (as opposed to the target velocity).

So can I think of it this way: the target velocity has three possible values, 20 m/s, 25 m/s and 30 m/s, while the true velocity is the velocity of the agent vehicle as it drives on the road, continuous in time. Is this right? And the average velocity over one episode is the distance divided by the time, is that right?

eleurent commented 4 years ago

Yes, that's right.

CFXgoon commented 4 years ago

Yes, that's right.

Thanks!! How does the agent (ego vehicle) plan a lane-change trajectory when it needs to change lane? In other words, how does it achieve longitudinal and lateral control, and where is this implemented? Thank you very much!!! I'm sorry that I bother you a lot...

eleurent commented 4 years ago

@CFXgoon for questions about highway-env, you can add issues on the original repo https://github.com/eleurent/highway-env/issues

The longitudinal and lateral control is achieved here: https://github.com/eleurent/highway-env/blob/master/highway_env/vehicle/controller.py#L10

The steering_control() method (https://github.com/eleurent/highway-env/blob/master/highway_env/vehicle/controller.py#L112) chooses a steering command to track a target_lane, while the velocity_control() method chooses an acceleration command to reach the desired target_velocity: https://github.com/eleurent/highway-env/blob/master/highway_env/vehicle/controller.py#L142
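As a rough illustration only (hypothetical gains and variable names, not the actual highway-env controller linked above), the two control laws can be thought of as simple proportional feedback:

# Illustrative feedback laws, not the highway-env implementation
def velocity_control(speed, target_velocity, kp_a=0.5):
    # Acceleration command proportional to the velocity error
    return kp_a * (target_velocity - speed)

def steering_control(lateral_offset, heading_error, kp_y=0.3, kp_psi=1.0):
    # Steering command combining lateral-position and heading feedback,
    # so the vehicle converges to the centre of its target_lane
    return -kp_y * lateral_offset - kp_psi * heading_error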