tkn-tub / veins-gym

Reinforcement Learning-based VANET simulations
https://www2.tkn.tu-berlin.de/software/veins-gym/
GNU General Public License v2.0

Shutdown method #2

Closed Anas-1998 closed 2 years ago

Anas-1998 commented 2 years ago

Hello. First, thanks for this great work! Second, from what I have seen while studying veins-gym, the shutdown command is always associated with zero reward and a random sample from the observation space. I would like to suggest adding a reward that the simulation can send together with the terminal state, since a lot of environments give some reward for reaching the terminal state. I worked around this issue with the following code:

if (accident == true) {
    // Send a final observation together with the terminal reward.
    reward_last = -15;
    std::array<double, 6> observation_last = {0, 0, 0, 0, 0, 0};
    const auto request = serializeObservation(observation_last, reward_last);
    auto response = gymCon->communicate(request);

    // Then send the shutdown request.
    veinsgym::proto::Request request2;
    request2.set_id(1);
    *(request2.mutable_shutdown()) = {};
    auto response2 = gymCon->communicate(request2);
}

My problem now is that in some cases it sends the first request but then sits idle in the notebook without sending the shutdown, and I can't figure out why. Do you have any idea? In the screenshot below you can see the output from my environment: the reward = -15 is sent, but then it doesn't shut down! [Screenshot from 2022-03-28 03-47-44] Thanks for your help!

dbuse commented 2 years ago

Hi @Anas-1998,

I see that providing a final reward would make sense in the last step (while an observation probably does not).

This could be worked around by the Veins side simply ignoring the last action and replying with a shutdown message once it considers the simulation done. Conceptually, it is the Veins side that decides when a simulation is done and thus initiates a clean shutdown. The Python/agent side can trigger something similar through a reset, but that would abort the Veins simulation (and potentially start a new one) instead of shutting it down cleanly, at least in the current state of the implementation.
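The workaround described above can be pictured with a toy model of the message exchange. The sketch below is Python and not veins-gym code: the function name simulation_side, the message labels "step" and "shutdown", and the integer observations are all assumptions made for illustration. It only shows the ordering: the action received in reply to the final observation is dropped, and a shutdown message follows.

```python
from typing import Callable, List, Tuple

# (kind, payload); the kinds "step" and "shutdown" are hypothetical labels.
Message = Tuple[str, object]


def simulation_side(
    n_steps: int, policy: Callable[[int], int]
) -> Tuple[List[Message], List[int]]:
    """Toy model of one episode as seen from the simulation side.

    Returns the messages the simulation sent and the actions it applied.
    """
    sent: List[Message] = []
    applied: List[int] = []
    for t in range(n_steps):
        sent.append(("step", t))  # observation for time step t
        action = policy(t)        # the agent's reply to that observation
        if t == n_steps - 1:
            # The simulation considers itself done: ignore this last
            # action and send a shutdown message instead of another step.
            sent.append(("shutdown", None))
            break
        applied.append(action)
    return sent, applied
```

In this model the agent always receives a reply to its last action, which is the shutdown message, so neither side is left waiting.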

A cleaner approach to what you ask for would be to add a (semantically optional) Space reward = 1 field to the message Shutdown definition. veins_gym could then check whether that value exists and return it to the agent, or continue to return 0 if there is none. If you would like to implement this, please do so and open a pull request. Then I'll take a look at it.
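On the Python side, the check described above might look roughly like the following. This is a minimal sketch under assumptions, not the veins_gym implementation: Shutdown here is a hypothetical dataclass standing in for the generated protobuf message, the proposed Space field is simplified to a plain optional float, and final_reward is an invented helper name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Shutdown:
    """Hypothetical stand-in for the protobuf Shutdown message.

    `reward` models the proposed optional field; None means "not set".
    """

    reward: Optional[float] = None


def final_reward(shutdown: Shutdown, default: float = 0.0) -> float:
    """Return the reward carried by a shutdown message.

    Falls back to `default` (0, matching the current behaviour) when the
    simulation did not attach a reward.
    """
    if shutdown.reward is None:
        return default
    return float(shutdown.reward)
```

With the terminal reward from the first post, final_reward(Shutdown(reward=-15.0)) would yield -15.0, while a shutdown without a reward keeps the current result of 0.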

Anas-1998 commented 2 years ago

Many thanks. If I have time, I'll go for it.