tkn-tub / veins-gym

Reinforcement Learning-based VANET simulations
https://www2.tkn.tu-berlin.de/software/veins-gym/
GNU General Public License v2.0

step function in __init__ #3

Closed lionyouko closed 2 years ago

lionyouko commented 2 years ago

One may want to start the simulation as a separate process directly from OMNeT++ and use it with Veins-Gym. That is possible. However, at line 224, when an episode ends (or, more generally, whenever done is true), the code tries to wait() on a process that does not exist if you never launched one from the RL agent, and the program crashes. That call could be changed to:

if self.veins:
    self.veins.wait()

I changed my copy of the source code to that, and it worked as intended with no apparent side effects.
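A minimal sketch of that guard, assuming the environment keeps the simulation process in an attribute that stays None when the simulation is started externally. The class ProcessOwner and its method finish_episode are hypothetical stand-ins for illustration, not the Veins-Gym code; only subprocess.Popen and its wait() method are real standard-library calls.

```python
import subprocess
import sys


class ProcessOwner:
    """Hypothetical stand-in for an environment that may own a subprocess."""

    def __init__(self, launch=False):
        # None means the simulation was started elsewhere (e.g. from OMNeT++).
        self.veins = None
        if launch:
            # A trivial child process stands in for the simulation here.
            self.veins = subprocess.Popen([sys.executable, "-c", "pass"])

    def finish_episode(self):
        """Wait for the child only if this object actually launched one."""
        if self.veins:
            return_code = self.veins.wait()
            self.veins = None
            return return_code
        return None


external = ProcessOwner(launch=False)
external_result = external.finish_episode()  # no process, so nothing to wait for

owned = ProcessOwner(launch=True)
owned_result = owned.finish_episode()  # waits for the child to exit
```

With the guard in place, the externally launched case simply skips the wait instead of failing on a missing process object.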

lionyouko commented 2 years ago

More info:

For some reason I don't understand, the run script needed to start the OMNeT++ process from Veins-Gym was being called repeatedly for me. Although the process was reported as running and its PID was shown, the simulation apparently didn't run: nothing was displayed and nothing was communicating with the agent.

On the other hand, launching them separately worked fine. So I decided to go with that approach and found that it was crashing at line 224.

I still don't know why the run script is called repeatedly without stopping. The last lines keep being printed indefinitely, as if the run script were being called multiple times, and I haven't been able to track down why.

Screenshot 2022-04-16, at 22 29 42

lionyouko commented 2 years ago

I found out how to launch Veins from the RL agent (Veins-Gym), but ZMQ gives me a random socket error. By random I mean that a run can either finish all episodes without any error or raise an error between episodes.

In the image below, the error happens between the first and second episodes. If I use OMNeT++ to run the simulation, this error doesn't seem to happen.

The line 250 mentioned there is also shown on the left side; it is where reset binds a new socket after closing the old one. It looks like a race condition, doesn't it? Do you have this problem too? If I put, for example, time.sleep(0.059) in TrivialRL at the end of each episode, it works fine. Thank you again for your patience.

IMAGE:

Screenshot 2022-04-17, at 01 32 31
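As an alternative to the fixed time.sleep workaround described above, the rebind could be wrapped in a bounded retry. This is only a sketch: bind_with_retry and flaky_bind are hypothetical names, and the fake bind merely imitates an "Address already in use" failure. With pyzmq, the exception type to retry on would presumably be zmq.ZMQError rather than OSError.

```python
import errno
import time


def bind_with_retry(bind, attempts=5, delay=0.05, retry_on=(OSError,)):
    """Call bind() until it succeeds or the attempts are used up.

    bind     -- zero-argument callable that performs the actual bind
    retry_on -- exception types treated as "address still busy"
    """
    for attempt in range(1, attempts + 1):
        try:
            return bind()
        except retry_on:
            if attempt == attempts:
                raise  # give up and surface the last error
            time.sleep(delay * attempt)  # wait a little longer each time


# Demonstration with a fake bind that fails twice before succeeding.
calls = {"count": 0}


def flaky_bind():
    calls["count"] += 1
    if calls["count"] < 3:
        raise OSError(errno.EADDRINUSE, "Address already in use")
    return "bound"


result = bind_with_retry(flaky_bind, delay=0.001)
```

Unlike a fixed sleep, the retry only waits when the bind actually fails, and it still raises if the address never becomes free.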

TrivialRL.py:

import time
import logging
import os
import random

import gym
import veins_gym
from veins_gym import veinsgym_pb2

def serialize_action(actions):

    reply = veinsgym_pb2.Reply()
    if not hasattr(actions, '__iter__'):
        actions = [actions]
    reply.action.box.values.extend(actions)

    return reply.SerializeToString()

gym.register(
    id="veins-v1",
    entry_point="veins_gym:VeinsEnv",
    kwargs={
        "scenario_dir": os.path.relpath(
            os.path.join(
                os.path.dirname(os.path.abspath(__file__)), "..", "lowerloss"
            )
        ),

        "timeout": 120.0,
        "print_veins_stdout": True,
        "action_serializer": serialize_action,
        "run_veins": False,  # do not start veins through Veins-Gym
        "port": 5555,  # pick a port to use
        # to run in a GUI, use "Qtenv" here and disable the timeout below:
        "user_interface": "Cmdenv",
        # "timeout": -1,
    },
)

def main():
    logging.basicConfig(level=logging.DEBUG)

    env = gym.make("veins-v1")
    logging.info("Env created")

    # env.reset()
    # logging.info("Env reset")
    done = False
    # fixed_action = [random.randint(0, 1)]
    episodes = 5
    rewards = []

    for episode in range(0, episodes):
        env.reset()
        logging.info("Env reset for episode %s", episode)
        observation, reward, done, info = env.step(random.randint(0, 1))

        while not done:

            r_action = random.randint(0, 1)
            observation, reward, done, info = env.step(r_action)
            rewards.append(reward)
            # The last message sent by OMNeT++ to veinsgym is a shutdown, which
            # sets done to True. When that happens, veinsgym also generates a
            # random observation (in its _parse_request function) that is
            # returned from step(); it must be discarded, as it belongs to an
            # undesired extra step anyway.
            if not done:
                logging.debug(
                    "Last action: %s, Reward: %.3f, Observation: %s",
                    r_action,
                    reward,
                    observation,
                )
        print("Number of steps taken:", len(rewards))
        print("Mean reward:", sum(rewards) / len(rewards))
        rewards = []
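        # workaround: give the old socket time to close before the next reset()
        time.sleep(0.059)


if __name__ == "__main__":
    main()

dbuse commented 2 years ago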

Hi @lionyouko

First, thanks for the suggestion around the self.veins.wait() call. I've included an if-condition and pushed a new version.

dbuse commented 2 years ago

For the second issue, "Address already in use" (and your observation that a small sleep fixes the issue) indicates the classical address reuse problem.

In a non-ZMQ situation, I would simply add SO_REUSEADDR to the socket options to fix this. But AFAIK, that is not possible with ZMQ.

Instead, I assume that the underlying TCP socket is not yet closed due to lingering messages.

So you can try to run self.socket.close(linger=0) and see if that fixes your issue. If so, I'll include that in the upstream code. So far I could not reproduce the issue on my systems. It should be safe to set linger=0, i.e., drop all unsent messages, as Veins is potentially gone by this point anyway.
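For comparison, the non-ZMQ fix mentioned above (SO_REUSEADDR) looks roughly like this with Python's standard socket module. This applies to plain TCP sockets only and is not a pyzmq call; the helper name make_listener is an assumption made for illustration.

```python
import socket


def make_listener(port=0, host="127.0.0.1"):
    """Create a listening TCP socket with SO_REUSEADDR set before bind()."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # The option must be set before bind(); it allows rebinding while old
    # connections on the same address are still in TIME_WAIT.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(1)
    return sock


first = make_listener()  # port 0: let the OS pick a free port
port = first.getsockname()[1]
first.close()

second = make_listener(port)  # rebind the same port right away
reuse_flag = second.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR)
rebound_port = second.getsockname()[1]
second.close()
```

For the ZMQ socket in this thread, the linger=0 suggestion remains the thing to test; the snippet only shows what the plain-socket equivalent would be.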

lionyouko commented 2 years ago

Thank you for the response. While searching for a solution, I also read about lingering in the ZMQ documentation (and because it isn't set, the linger time is "infinite"), but I couldn't take the investigation far enough to debug it and see whether some last messages were lingering. So I took this "conservative" approach of giving the program some time to finish its business before unbinding from the socket, as I couldn't predict for sure what happens to the remaining messages.

I will test it and come back later to report whether it works (I fully believe it will, but maybe forcing linger = 0 is "too much").

Thanks again. In the meantime, I finished my example scenario in OMNeT++. If you would like to take a look, I can upload it and you can use it here as one more example. I should say, however, that it is super simple; I made it only to see the tool working.

It consists of four nodes: one called agent sender, two called loss (loss30 and loss70), and one called Acker. loss30 loses 30% of the messages and loss70 loses 70%. The Acker tells the agent sender that it received the message, so the agent sender knows it is rewarded with +1 for a successful delivery and 0 otherwise. The agent on the Python side is very simple: it just computes the mean reward at the end of each episode, and its action is to send a random number, 0 or 1, to the agent sender in OMNeT++, meaning to choose loss30 or loss70. The point is that I didn't need to use TraCI or Veins; I tried to extract only the essentials of the examples (dcc and serpentine) to make it work with the bare minimum.
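The reward logic of the scenario described above can be sketched in pure Python, without OMNeT++ or Veins-Gym. Everything here is an illustrative assumption: the names LOSS_PROBABILITY, send_message and run_episode are hypothetical, and the lossy nodes and the Acker are collapsed into a single random draw.

```python
import random


LOSS_PROBABILITY = {0: 0.3, 1: 0.7}  # action 0 -> loss30, action 1 -> loss70


def send_message(action, rng):
    """Return reward 1 if the Acker would confirm delivery, else 0."""
    lost = rng.random() < LOSS_PROBABILITY[action]
    return 0 if lost else 1


def run_episode(steps, rng):
    """Random agent: pick loss30 or loss70 uniformly, return the mean reward."""
    rewards = [send_message(rng.randint(0, 1), rng) for _ in range(steps)]
    return sum(rewards) / len(rewards)


rng = random.Random(42)
mean_reward = run_episode(10_000, rng)
print("Mean reward:", mean_reward)  # should land close to 0.5
```

A uniformly random agent delivers with probability 0.5 * 0.7 + 0.5 * 0.3 = 0.5, so a mean reward near 0.5 is the baseline that a learning agent (always choosing loss30, for an expected 0.7) would have to beat.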

dbuse commented 2 years ago

Hi @lionyouko, thanks for the update. I hope you succeed with your further investigation.

I would really like to link your code as another example for VeinsGym. I think it is actually beneficial that it is such a simple example. Other newcomers exploring the tool may find it easier to follow than something we wrote after having designed VeinsGym ourselves.

serapergun commented 2 years ago

I am getting this error and I don't know why:

terminate called after throwing an instance of 'zmq::error_t' what(): Socket operation on non-socket

Communication with the simulator was fine before, and I got the mean reward. But I could not start it again.

What should I do?

lionyouko commented 2 years ago

Try changing your action into a vector of actions, like this:

def serialize_action(actions):
    reply = veinsgym_pb2.Reply()
    if not hasattr(actions, '__iter__'):
        actions = [actions]
    reply.action.box.values.extend(actions)
    return reply.SerializeToString()

Try this and tell me if it solves your issue. I remember having this problem, but it was a while ago and I don't remember the details.

dbuse commented 2 years ago

I think we can consider this (original) issue closed.

@serapergun : please do not duplicate issues without need. We'll continue with your problem here: https://github.com/tkn-tub/serpentine-env/issues/7

serapergun commented 2 years ago

Dear lionyouko, thanks for your help. I revised my code based on what you wrote. The calculation part works, but I got the error below:

ZMQError                                  Traceback (most recent call last)
Input In [5], in <cell line: 22>()
     20 print("Mean reward:", sum(rewards) / len(rewards))
     21 print("Info:",info)
---> 22 env.step(0)

File ~/anaconda3/lib/python3.8/site-packages/gym/wrappers/order_enforcing.py:11, in OrderEnforcing.step(self, action)
      9 def step(self, action):
     10     assert self._has_reset, "Cannot call env.step() before calling reset()"
---> 11     observation, reward, done, info = self.env.step(action)
     12     return observation, reward, done, info

File ~/anaconda3/lib/python3.8/site-packages/veins_gym/__init__.py:217, in VeinsEnv.step(self, action)
    213 def step(self, action):
    214     """
    215     Run one timestep of the environment's dynamics.
    216     """
--> 217     self.socket.send(self._action_serializer(action))
    218     step_result = self._parse_request(self._recv_request())
    219     if step_result.done:

File ~/anaconda3/lib/python3.8/site-packages/zmq/sugar/socket.py:547, in Socket.send(self, data, flags, copy, track, routing_id, group)
    540 data = zmq.Frame(
    541     data,
    542     track=track,
    543     copy=copy or None,
    544     copy_threshold=self.copy_threshold,
    545 )
    546 data.group = group
--> 547 return super(Socket, self).send(data, flags=flags, copy=copy, track=track)

File zmq/backend/cython/socket.pyx:718, in zmq.backend.cython.socket.Socket.send()

File zmq/backend/cython/socket.pyx:765, in zmq.backend.cython.socket.Socket.send()

File zmq/backend/cython/socket.pyx:247, in zmq.backend.cython.socket._send_copy()

File zmq/backend/cython/socket.pyx:242, in zmq.backend.cython.socket._send_copy()

File ~/anaconda3/lib/python3.8/site-packages/zmq/backend/cython/checkrc.pxd:28, in zmq.backend.cython.checkrc._check_rc()

ZMQError: Operation cannot be accomplished in current state

lionyouko commented 2 years ago

@serapergun You may want to ask dbuse whether he has set up a forum or something similar where we can post such questions. As far as I can read the error, the issue lies here: assert self._has_reset, "Cannot call env.step() before calling reset()". So please check that you are indeed resetting your env (env.reset()) at the beginning and, if you are using episodes, whenever an episode ends. Also, you are using Python notebooks, right? I haven't used them, so I don't know how much that can affect things. But I can tell you this: don't do what I did. I didn't take notes on some of the errors I got, and now I would have to deal with them again! Take notes every time you find a solution.

You can see me resetting the env just like dbuse does, because that is how you need to use the OpenAI Gym framework:

def main():
    logging.basicConfig(level=logging.DEBUG)

    env = gym.make("veins-v1")
    logging.info("Env created")
    # HERE RESET
    # env.reset()
    # logging.info("Env reset")
    done = False
    # fixed_action = [random.randint(0, 1)]
    episodes = 5
    rewards = []

    for episode in range(0, episodes):
        # HERE RESET
        env.reset()
        logging.info("Env reset for episode %s", episode)
        observation, reward, done, info = env.step(random.randint(0, 1))

        while not done:

            r_action = random.randint(0, 1)
            observation, reward, done, info = env.step(r_action)
            rewards.append(reward)
            # The last message sent by OMNeT++ to veinsgym is a shutdown, which
            # sets done to True. When that happens, veinsgym also generates a
            # random observation (in its _parse_request function) that is
            # returned from step(); it must be discarded, as it belongs to an
            # undesired extra step anyway.
            if not done:
                logging.debug(
                    "Last action: %s, Reward: %.3f, Observation: %s",
                    r_action,
                    reward,
                    observation,
                )
        print("Number of steps taken:", len(rewards))
        print("Mean reward:", sum(rewards) / len(rewards))
        rewards = []
        time.sleep(0.059)

Although the first reset is commented out, it was used before. Since I am using episodes now, it makes sense to reset at the beginning of each episode.
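The reset-before-step pattern can be tried without a simulator using a stub. This is a sketch under stated assumptions: StubEnv and run are hypothetical and only imitate the old four-value Gym step() return used in this thread. The stub raises RuntimeError when step() is called without a running episode, which is meant to mimic (not reproduce) the kind of misuse that the env.step(0) call after the episode summary in the traceback above may reflect.

```python
import random


class StubEnv:
    """Hypothetical stand-in for a Gym environment with fixed-length episodes."""

    def __init__(self, episode_length=3):
        self.episode_length = episode_length
        self._steps_left = 0  # 0 means no episode is running; reset() is required

    def reset(self):
        self._steps_left = self.episode_length
        return 0.0  # dummy observation

    def step(self, action):
        if self._steps_left == 0:
            raise RuntimeError("step() called without a running episode")
        self._steps_left -= 1
        done = self._steps_left == 0
        reward = 0.0 if done else float(action)
        return 0.0, reward, done, {}


def run(env, episodes, rng):
    """Run episodes, resetting first and skipping the final 'done' step."""
    means = []
    for _ in range(episodes):
        env.reset()
        rewards = []
        done = False
        while not done:
            _, reward, done, _ = env.step(rng.randint(0, 1))
            if not done:  # the last step only signals the shutdown
                rewards.append(reward)
        # Guard against episodes that end on their very first step.
        means.append(sum(rewards) / len(rewards) if rewards else 0.0)
    return means


env = StubEnv()
means = run(env, episodes=2, rng=random.Random(0))

try:
    env.step(0)  # the episode is over and reset() was not called again
    misuse_error = None
except RuntimeError as exc:
    misuse_error = str(exc)
```

Compared with the loop above, this variant keeps the final shutdown step out of the reward list and avoids a division by zero when an episode ends immediately.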

serapergun commented 2 years ago

@lionyouko Thank you so much. Such a good and detailed explanation; you're an expert on this. I will try it immediately.

serapergun commented 2 years ago

@lionyouko Is it possible to share the project you mentioned? I really want to improve my skills with veins-gym, and I am still looking for code or a project that runs correctly.

lionyouko commented 2 years ago

@serapergun I have some issues with my GitHub account that I need to address first. I will try to find a solution, and then I may ask Dr. dbuse to upload my example. Keep in mind that Dr. dbuse will need to look over my project, even though I know it works, because it will be an example project open to everybody, and he must be sure it meets his needs. So it may take a while before you see it here. In the meantime, I can't recommend enough that you read the veinsgym codebase, so you start to understand how the whole communication between agent and simulation works. You may also want to learn a bit about ZMQ and Protocol Buffers; they are convenient tools Dr. dbuse used, and they are worth knowing since the veinsgym middleware is still under development and you intend to use it. Alternatively, once I have resolved the issue with my account, I can upload the project to my profile informally.

serapergun commented 2 years ago

@lionyouko I understand, and of course you are totally right in this situation. Thank you so much for all your useful comments.