tkn-tub / veins-gym

Reinforcement Learning-based VANET simulations
https://www2.tkn.tu-berlin.de/software/veins-gym/
GNU General Public License v2.0

[question] Why The Default Omnet Run file fails? #10

Closed lionyouko closed 2 years ago

lionyouko commented 2 years ago

Sorry for bothering you again, Mr Bose.

When I was using veins-gym, I changed run.sh so that it runs my scenario directly from a compiled file in the root of the project. It worked fine there.

But now I have created another project with src and simulations directories (one of the default OMNeT++ project layouts). The simulations dir therefore contains a file also called run, with the following content (its name is "run", but I will write runX to make the figure below easier to understand):

#!/bin/sh
cd `dirname $0`
../src/aibasedoffloadings -n .:../src $*
# for shared lib, use: opp_run -l ../src/aibasedoffloadings -n .:../src $*

This file is generated by OMNeT++. When I copy my veins-gym scenario into the simulations folder, it gets its own directory, say rltask, and inside it I put run.sh and the .ini file (alongside the .ned files).

root
 | - out
 | - src
 | - simulations
      | - rltask
      |    | - run.sh (exec ../runX "$@")
      | - runX

If I reference the default OMNeT++ run script from run.sh, the Python script starts but hangs at the listening step. The scenario files themselves are fine: they are the same files that worked before, when I ran rltask directly from the root of the other project. So I already know it should work, but when I run through the default OMNeT++ run script, it hangs.

I believe you have a reason to have created that particular run you shared along both serpentine and dcc, so maybe you know why this is happening.

I would also like to say that in this project I am using INET and Veins as project references: they are not libraries in a subdirectory inside the project, but separately opened and built projects in a different directory. I mention this because, in both serpentine and dcc, the "run" script Mr Bose wrote assumes a lib directory with Veins inside, for instance. I am not sure how to modify the script to account for these differences, since INET and Veins are referenced from outside the project directory. That is why I tried to use the default run script.

(In my other project, run.sh has exec ../rltask "$@", and the compiled rltask [x86_64/le] file was copied into the root dir of the project. That setup works fine, so I know the problem is not in the Python script or in the OMNeT++ .ned, .cc and .h files I used.)

I will be glad for any help Mr Bose can give me.
I hope it is something simple.

lionyouko commented 2 years ago

I noticed this: when I run my simulation from within OMNeT++, the command is:

Command line: ../../src/AIBasedOffloadings -m -n ..:../../src:../../../inet/src:../../../inet/examples:../../../inet/tutorials:../../../inet/showcases:../../../resources/veins-master/examples/veins:../../../resources/veins-master/src/veins --image-path=../../../inet/images:../../../resources/veins-master/images -l ../../../inet/src/INET -l ../../../resources/veins-master/src/veins omnetpp.ini

In the Makefile in src, there are:

# Other makefile variables (-K)
INET_PROJ=../../inet
VEINS_PROJ=../../resources/veins-master

So I tried to do the following:

#!/usr/bin/env python3

"""
Runs scenario simulation with veins current directory
"""

import os
import argparse

def relpath(s):
    root = os.path.dirname(os.path.realpath(__file__))
    return os.path.relpath(os.path.join(root, s), ".")

parser = argparse.ArgumentParser("Run a Veins simulation")
parser.add_argument(
    "-d",
    "--debug",
    action="store_true",
    help="Run using opp_run_dbg (instead of opp_run)",
)
parser.add_argument(
    "-t",
    "--tool",
    metavar="TOOL",
    dest="tool",
    choices=["lldb", "gdb", "memcheck"],
    help="Wrap opp_run execution in TOOL (lldb, gdb or memcheck)",
)
parser.add_argument(
    "-v",
    "--verbose",
    action="store_true",
    help="Print command line before executing",
)
parser.add_argument(
    "--", dest="arguments", help="Arguments to pass to opp_run"
)
args, omnet_args = parser.parse_known_args()

print()
print("Args: ", vars(args))
print()
print("Omnet Args: ", omnet_args)
print()

if (len(omnet_args) > 0) and omnet_args[0] == "--":
    omnet_args = omnet_args[1:]

run_libsVEINS = [

    relpath(s) for s in ["../../resources/veins-master"]
]

run_libsINET = [

    relpath(s) for s in ["../../inet"]
]

run_nedsVEINS = [
    relpath(s) for s in [" ./", "../../resources/veins-master"]
]  + ["."]

run_nedsINET = [
    relpath(s) for s in [" ./", "../../inet"]
]  + ["."]

run_imgsVEINS = [relpath(s) for s in ["../../resources/veins-master/images"]]

run_imgsINET = [relpath(s) for s in ["../../inet/images"]]

run = "../src/aibasedoffloadings" if not args.debug else "../src/aibasedoffloadings_dbg"
#run = "./" if not args.debug else "../src/experiment_dbg"

lib_flagsVEINS = ["-l%s" % s for s in run_libsVEINS]

lib_flagsINET = ["-l%s" % s for s in run_libsINET]

ned_flagsVEINS = ["-n" + ";".join(run_nedsVEINS)]

ned_flagsINET = ["-n" + ";".join(run_nedsINET)]

img_flagsVEINS = ["--image-path=" + ";".join(run_imgsVEINS)]

img_flagsINET = ["--image-path=" + ";".join(run_imgsINET)]

prefix = []
if args.tool == "lldb":
    prefix = ["lldb", "--"]
if args.tool == "gdb":
    prefix = ["gdb", "--args"]
if args.tool == "memcheck":
    prefix = [ 
        "valgrind",
        "--tool=memcheck",
        "--leak-check=full",
        "--dsymutil=yes",
        "--log-file=valgrind.out",
    ]

# cmdline = ["sudo"] + prefix + [run]  + lib_flags + ned_flags + img_flags  + omnet_args

cmdline = ["sudo"] + prefix + [run]  + lib_flagsVEINS + lib_flagsINET + ned_flagsVEINS + ned_flagsINET + img_flagsVEINS  + img_flagsINET + omnet_args

print("I go run now!")
print()

if not args.verbose:
    print(
        "Running with command line arguments: %s"
        % " ".join(['"%s"' % arg for arg in cmdline])
    )

os.execvp("env", cmdline)
exit()

But although I do have a file called aibasedoffloadings in the src folder (actually AIBasedOffloadings [x86_64/le]), when I run it, it says:

Args:  {'debug': False, 'tool': None, 'verbose': False, 'arguments': None}

Omnet Args:  ['-uCmdenv', '-cMultipleGymsOne', '--seed-set=0', '--*.manager.seed=0', '--*.gym_connection.port=5551']

I go run now!

Running with command line arguments: "sudo" "../src/aibasedoffloadings" "-l../../../resources/veins-master" "-l../../../inet" "-n../ .;../../../resources/veins-master;." "-n../ .;../../../inet;." "--image-path=../../../resources/veins-master/images" "--image-path=../../../inet/images" "-uCmdenv" "-cMultipleGymsOne" "--seed-set=0" "--*.manager.seed=0" "--*.gym_connection.port=5551"
sudo: ‘../src/aibasedoffloadings’: No such file or directory

It doesn't find the file. This run script is in simulations; it is the runX one being called by run.sh.
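One way to narrow down a "No such file or directory" like this is to resolve the binary against the directory of the run script instead of the current working directory, and to check it before calling exec. The following is only a minimal sketch: resolve_executable is a hypothetical helper (it is not part of veins-gym or OMNeT++), and the relative path passed to it is an assumption about the project layout described in this thread.

```python
import os


def resolve_executable(script_dir, relative_path):
    """Return a normalized path to the simulation binary, or raise if unusable.

    The path is resolved against the directory of the run script rather than
    the current working directory, so it does not matter where run.sh is
    started from. (Hypothetical helper, for illustration only.)
    """
    candidate = os.path.normpath(os.path.join(script_dir, relative_path))
    if not os.path.isfile(candidate):
        # List the parent directory to expose spelling or upper/lower-case
        # mismatches such as aibasedoffloadings vs. AIBasedOffloadings.
        parent = os.path.dirname(candidate)
        found = sorted(os.listdir(parent)) if os.path.isdir(parent) else []
        raise FileNotFoundError(
            "%s not found; %s contains: %s" % (candidate, parent, found)
        )
    if not os.access(candidate, os.X_OK):
        raise PermissionError("%s is not executable" % candidate)
    return candidate
```

In a script placed in simulations, this could be used as run = resolve_executable(os.path.dirname(os.path.realpath(__file__)), "../src/AIBasedOffloadings"). Note also that os.execvp(file, args) treats args[0] as the program's own name, so the argument list should start with the name of the command that is actually being executed.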

I suspect that the duplicated lib and image flags will cause some problem later, but that does not seem to be the problem right now. Although the compiled aibasedoffloadings is there, the script doesn't find it when I try to run.
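The command line printed by the IDE earlier in this thread uses one -n option and one --image-path option, each holding a ":"-separated list, plus one -l option per library. A run script could mirror that shape with something like the sketch below. build_flags is a hypothetical helper, and the ":" separator is an assumption taken from that IDE command line; this thread does not establish how OMNeT++ treats repeated -n options. The sketch also strips whitespace, because the leading space in " ./" in the script above ends up in the printed "-n../ ." argument.

```python
def build_flags(lib_paths, ned_dirs, image_dirs, separator=":"):
    """Build one -l flag per library, and a single -n and --image-path flag.

    Hypothetical helper for illustration; the separator is an assumption
    based on the IDE-generated command line shown in this thread.
    """

    def clean(paths):
        # Strip stray whitespace and drop duplicates while keeping order.
        seen = []
        for path in paths:
            path = path.strip()
            if path and path not in seen:
                seen.append(path)
        return seen

    flags = ["-l" + path for path in clean(lib_paths)]
    flags.append("-n" + separator.join(clean(ned_dirs)))
    flags.append("--image-path=" + separator.join(clean(image_dirs)))
    return flags
```

The returned list can be concatenated with the executable and the remaining OMNeT++ arguments, in the same way as lib_flags, ned_flags and img_flags in the original veins-gym run script.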

Thank you!

lionyouko commented 2 years ago

I have almost managed to adapt the file to the situation described. I will post it here for reference, for anybody who may need it.

lionyouko commented 2 years ago

Unfortunately, it is giving me the following error and I don't know how to solve it:

Action space: Discrete(2)
INFO:root:Closing VeinsEnv.
Traceback (most recent call last):
  File "StableBaselineDQNAgent.py", line 116, in <module>
    main()
  File "StableBaselineDQNAgent.py", line 68, in main
    check_env(env)
  File "/home/parallels/tools_thesis/omnetpp-5.6.2/samples/lowerlossRL1/venv/lib/python3.8/site-packages/stable_baselines3/common/env_checker.py", line 283, in check_env
    _check_returned_values(env, observation_space, action_space)
  File "/home/parallels/tools_thesis/omnetpp-5.6.2/samples/lowerlossRL1/venv/lib/python3.8/site-packages/stable_baselines3/common/env_checker.py", line 142, in _check_returned_values
    obs = env.reset()
  File "/home/parallels/tools_thesis/omnetpp-5.6.2/samples/lowerlossRL1/venv/lib/python3.8/site-packages/gym/wrappers/order_enforcing.py", line 16, in reset
    return self.env.reset(**kwargs)
  File "/home/parallels/tools_thesis/omnetpp-5.6.2/samples/lowerlossRL1/venv/lib/python3.8/site-packages/veins_gym/__init__.py", line 242, in reset
    self.close()
  File "/home/parallels/tools_thesis/omnetpp-5.6.2/samples/lowerlossRL1/venv/lib/python3.8/site-packages/veins_gym/__init__.py", line 309, in close
    shutdown_veins(self.veins)
  File "/home/parallels/tools_thesis/omnetpp-5.6.2/samples/lowerlossRL1/venv/lib/python3.8/site-packages/veins_gym/__init__.py", line 114, in shutdown_veins
    process.terminate()
  File "/usr/lib/python3.8/subprocess.py", line 1938, in terminate
    self.send_signal(signal.SIGTERM)
  File "/usr/lib/python3.8/subprocess.py", line 1933, in send_signal
    os.kill(self.pid, sig)
PermissionError: [Errno 1] Operation not permitted

In the reset function, when it is time to terminate the process, it raises an error. It happens only in this particular simulation, which I built again. In the others, it didn't raise that error.

My script is as follows:

#!/usr/bin/env python3

"""
Runs scenario simulation with veins current directory
"""

import os
import argparse
import sys

def relpath(s):
    root = os.path.dirname(os.path.realpath(__file__))
    return os.path.relpath(os.path.join(root, s), ".")

parser = argparse.ArgumentParser("Run a Veins simulation")
parser.add_argument(
    "-d",
    "--debug",
    action="store_true",
    help="Run using opp_run_dbg (instead of opp_run)",
)
parser.add_argument(
    "-t",
    "--tool",
    metavar="TOOL",
    dest="tool",
    choices=["lldb", "gdb", "memcheck"],
    help="Wrap opp_run execution in TOOL (lldb, gdb or memcheck)",
)
parser.add_argument(
    "-v",
    "--verbose",
    action="store_true",
    help="Print command line before executing",
)
parser.add_argument(
    "--", dest="arguments", help="Arguments to pass to opp_run"
)
args, omnet_args = parser.parse_known_args()

print()
print("Args: ", vars(args))
print()
print("Omnet Args: ", omnet_args)
print()

if (len(omnet_args) > 0) and omnet_args[0] == "--":
    omnet_args = omnet_args[1:]

run_libs = [   
    relpath(s) for s in ["../../resources/veins-master/src/veins","../../inet/src/INET"]
]

run_neds = [
    relpath(s) for s in ["./", "../../resources/veins-master/examples/veins", "../../resources/veins-master/src/veins/", "../../inet/examples", "../../inet/tutorials", "../../inet/showcases", "../../inet/src", "../src/"]
]

run_imgs = [relpath(s) for s in ["../../resources/veins-master/images","../../inet/images"]]

run = "../../src/AIBasedOffloadings" if not args.debug else "../../src/aibasedoffloadings_dbg"

lib_flags = ["-l%s" % s for s in run_libs]

ned_flags = ["-n" + ":".join(run_neds)]
img_flags = ["--image-path=" + ";".join(run_imgs)]
print(ned_flags)

prefix = []
if args.tool == "lldb":
    prefix = ["lldb", "--"]
if args.tool == "gdb":
    prefix = ["gdb", "--args"]
if args.tool == "memcheck":
    prefix = [ 
        "valgrind",
        "--tool=memcheck",
        "--leak-check=full",
        "--dsymutil=yes",
        "--log-file=valgrind.out",
    ]

# cmdline = ["sudo"] + prefix + [run]  + lib_flags + ned_flags + img_flags  + omnet_args

cmdline =  prefix +  [run] + lib_flags  +  ned_flags + img_flags + omnet_args

if not args.verbose:
    print(
        "Running with command line arguments: %s"
        % " ".join(['"%s"' % arg for arg in cmdline])
    )

os.execvp("env", ["env"] + cmdline)
exit()

(I have updated the code above; I don't know what I was thinking before.) An example output is this:

INFO:root:Closing VeinsEnv.
DEBUG:root:Listening on configured port 5551
DEBUG:root:Launching veins experiment using command `['./run.sh', '-uCmdenv', '-cMultipleGymsOne', '--seed-set=0', '--*.manager.seed=0', '--*.gym_connection.port=5551']`
DEBUG:root:Veins process launched with pid 275098
INFO:root:Launched veins experiment, waiting for request.

Args:  {'debug': False, 'tool': None, 'verbose': False, 'arguments': None}

Omnet Args:  ['-uCmdenv', '-cMultipleGymsOne', '--seed-set=0', '--*.manager.seed=0', '--*.gym_connection.port=5551']

/home/parallels/tools_thesis/simulationscenarios/AIBasedOffloadings/simulations/lowerloss

Running with command line arguments: "sudo" "../../src/AIBasedOffloadings" "-l../../../resources/veins-master/src/veins" "-l../../../inet/src/INET" "-n..:../../../resources/veins-master/src/veins:." "-n..:../../../inet:." "--image-path=../../../resources/veins-master/images" "--image-path=../../../inet/images" "-uCmdenv" "-cMultipleGymsOne" "--seed-set=0" "--*.manager.seed=0" "--*.gym_connection.port=5551"
['-l../../../resources/veins-master/src/veins -l../../../inet/src/INET']

OMNeT++ Discrete Event Simulation  (C) 1992-2019 Andras Varga, OpenSim Ltd.
Version: 5.6.2, build: 200518-aa79d0918f, edition: Academic Public License -- NOT FOR COMMERCIAL USE
See the license for distribution terms and warranty disclaimer

Setting up Cmdenv...

Loading NED files from ..:  8
Loading NED files from ../../../resources/veins-master/src/veins:  43
Loading NED files from .:  2

Preparing for running configuration MultipleGymsOne, run #0...
Assigned runID=MultipleGymsOne-0-20220702-00:48:15-275099
Setting up network "Lowloss"...
Initializing...
INFO:root:Received first request from Veins, ready to run.

Observation space: Box([0. 0. 0.], [1000. 1000. 1000.], (3,), float32)
Shape: (3,)
Action space: Discrete(2)
INFO:root:Closing VeinsEnv.
Traceback (most recent call last):
  File "StableBaselineDQNAgent.py", line 116, in <module>
    main()
  File "StableBaselineDQNAgent.py", line 68, in main
    check_env(env)
  File "/home/parallels/tools_thesis/omnetpp-5.6.2/samples/lowerlossRL1/venv/lib/python3.8/site-packages/stable_baselines3/common/env_checker.py", line 283, in check_env
    _check_returned_values(env, observation_space, action_space)
  File "/home/parallels/tools_thesis/omnetpp-5.6.2/samples/lowerlossRL1/venv/lib/python3.8/site-packages/stable_baselines3/common/env_checker.py", line 142, in _check_returned_values
    obs = env.reset()
  File "/home/parallels/tools_thesis/omnetpp-5.6.2/samples/lowerlossRL1/venv/lib/python3.8/site-packages/gym/wrappers/order_enforcing.py", line 16, in reset
    return self.env.reset(**kwargs)
  File "/home/parallels/tools_thesis/omnetpp-5.6.2/samples/lowerlossRL1/venv/lib/python3.8/site-packages/veins_gym/__init__.py", line 242, in reset
    self.close()
  File "/home/parallels/tools_thesis/omnetpp-5.6.2/samples/lowerlossRL1/venv/lib/python3.8/site-packages/veins_gym/__init__.py", line 309, in close
    shutdown_veins(self.veins)
  File "/home/parallels/tools_thesis/omnetpp-5.6.2/samples/lowerlossRL1/venv/lib/python3.8/site-packages/veins_gym/__init__.py", line 114, in shutdown_veins
    process.terminate()
  File "/usr/lib/python3.8/subprocess.py", line 1938, in terminate
    self.send_signal(signal.SIGTERM)
  File "/usr/lib/python3.8/subprocess.py", line 1933, in send_signal
    os.kill(self.pid, sig)
PermissionError: [Errno 1] Operation not permitted

Thank you

lionyouko commented 2 years ago

I would like to announce that I may have found the solution to the problem described above.

By reading a post about killing a process started with sudo, I found out that, since our cmdline was

cmdline = ["sudo"] + prefix + [run] + lib_flagsVEINS + lib_flagsINET + ned_flagsVEINS + ned_flagsINET + img_flagsVEINS + img_flagsINET + omnet_args

but I am running as a normal user from inside the venv, I would not be able to stop the process started by sudo. So I simply removed the ["sudo"] part, and it seems to be working now.
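The effect can be reproduced outside of veins-gym with a small sketch like the one below, which assumes a POSIX system. launch_and_stop is a hypothetical helper, and the sleeping Python child is only a stand-in for the simulation process; it is not how veins-gym itself launches Veins.

```python
import signal
import subprocess
import sys


def launch_and_stop(cmdline):
    """Start cmdline as the current user, then stop it with SIGTERM."""
    process = subprocess.Popen(cmdline)
    try:
        # Succeeds because the child runs under the same user as this script.
        # A child started through sudo would be owned by root, and this call
        # would raise PermissionError for an unprivileged user.
        process.terminate()
    except PermissionError:
        print("cannot signal pid %d: owned by another user" % process.pid)
        raise
    return process.wait(timeout=10)


if __name__ == "__main__":
    # Stand-in for the simulation: a child that would otherwise sleep a minute.
    code = launch_and_stop([sys.executable, "-c", "import time; time.sleep(60)"])
    print("terminated by SIGTERM:", code == -signal.SIGTERM)
```

With "sudo" prepended to the command line, the same terminate() call is where the PermissionError in the traceback above originates.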

If I find more details, I will gladly post here.