Closed andreaBassichUoY closed 6 years ago
Hi @andreaBassich, as you might imagine HFO was not designed with dynamic reloading of parameters in mind. This is because the rcssserver itself is not designed for reloading parameters. However, it may be possible to hack a solution. Can you provide an example to replicate the error you're getting?
In soccer_env.py I added a method to try to restart the server:
from importlib import reload

def _restartServer(self):
    self.env.act(hfo_py.QUIT)
    self.env.step()
    os.system("killall -9 rcssserver")
    reload(hfo_py)
    self.env = hfo_py.HFOEnvironment()
    self._start_hfo_server()
    time.sleep(1)
    self.env.connectToServer(hfo_py.HIGH_LEVEL_FEATURE_SET, config_dir=hfo_py.get_config_path())
The error arises when the last line of the method is executed, which surprises me as I assumed that by killing the server the formations wouldn't be already initialised.
Thanks for your help
I believe the formations are loaded by the agents, rather than the server. So it may be useful to kill the agents' processes as well. This should be taken care of by the cleanup function, which should be called when the hfo_environment is garbage collected (https://github.com/LARG/HFO/blob/master/bin/HFO#L21).
It seems pretty weird - sort of like the module is not being properly reloaded. 1) Can you try with just a single offense agent to simplify things? 2) Can you send full stack trace?
I can't seem to get the stack trace for this specific error. I tried with traceback, i.e.
try:
    self.env.connectToServer(hfo_py.HIGH_LEVEL_FEATURE_SET, config_dir=hfo_py.get_config_path())
except Exception:
    traceback.print_exc()
but it doesn't get to the except clause. By debugging I could follow it as far as line 138 in hfo.py:
hfo_lib.connectToServer(self.obj,
                        feature_set,
                        config_dir.encode('utf-8'),
                        server_port,server_addr.encode('utf-8'),
                        team_name.encode('utf-8'),
                        play_goalie,
                        record_dir.encode('utf-8'))
After which it crashes, I'm guessing this ends up calling line 32 in HFO.cpp. As a side note for testing purposes I was using only one agent.
I found it weird as well, as I thought that by reloading the whole module there wouldn't be any problems. Also you're right, the problem should be on the agent's side, as even if I don't start a new server, I get the same error instead of [ConnectToServer] Server Down!, so it definitely doesn't get to line 63 in HFO.cpp.
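One possible explanation for the except clause never running, which nothing in this thread confirms, is that the native library ends the whole process itself instead of raising a Python exception. The sketch below reproduces that behaviour in plain Python without hfo_py: os._exit stands in for the assumed native exit, and CHILD_SCRIPT and run_child are names made up for this illustration.

```python
import subprocess
import sys
import textwrap

# Child script: os._exit() stands in for a native library that terminates
# the process itself (an assumption, not something confirmed in this thread).
CHILD_SCRIPT = textwrap.dedent("""
    import os
    try:
        os._exit(3)  # the process ends here; no Python exception is raised
    except BaseException:
        print("caught")  # never reached
""")


def run_child():
    """Run the script in a separate interpreter and report what happened."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_SCRIPT],
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout


if __name__ == "__main__":
    code, out = run_child()
    print("exit code:", code, "stdout:", repr(out))
```

If the real crash works this way, the only place it can be observed is from outside the crashing process, for example through the exit code seen by a parent process.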
Hi @mhauskn, are there any new developments regarding this issue?
Sorry, no updates from my end. I'd be happy to accept a PR if you find a way to add this functionality.
Hi @mhauskn, after a bit of debugging I found that, to restart the server properly, the main process must not initialise or connect to the server itself; that part has to be done by another process. That process can then be killed whenever the server is restarted, which avoids the error mentioned above. In the end it wasn't anything within the library itself.
Cheers
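A minimal sketch of the idea described above, assuming that everything touching the native library lives in a child process that the main process can kill and start again. AgentProcess and AGENT_SCRIPT are hypothetical names; the child script only echoes its parameters, where a real one would create hfo_py.HFOEnvironment() and call connectToServer().

```python
import subprocess
import sys

# Stand-in for the script that would own the HFO connection. It reports
# that it is up, then blocks until the parent kills it.
AGENT_SCRIPT = r"""
import sys
print("connected " + sys.argv[1], flush=True)
sys.stdin.readline()
"""


class AgentProcess:
    """Hypothetical helper that owns one killable agent process."""

    def __init__(self):
        self._proc = None

    def start(self, params):
        self._proc = subprocess.Popen(
            [sys.executable, "-c", AGENT_SCRIPT, params],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )
        # The first line of output tells us the child is running.
        return self._proc.stdout.readline().strip()

    def stop(self):
        if self._proc is not None:
            self._proc.kill()         # all state inside the child is discarded
            self._proc.communicate()  # close the pipes and reap the process
            self._proc = None

    def restart(self, params):
        self.stop()
        return self.start(params)
```

Because each restart begins in a fresh interpreter, nothing initialised by the previous run can survive into the next one, which is the property the workaround relies on.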
Great to hear that you found a workaround!
Hi @andreaBassich. Could you please elaborate on how you started the server in another process, and what that other process was?
Hi @Amrit-pal-Singh,
Digging through my old code I found the way I initialised the server process, hopefully, this will be helpful with your current issue.
async def _start_server_as(self, get_port=False):
    cmd = hfo_py.get_hfo_path()
    for k in self._params.keys():
        v = self._params[k]
        if isinstance(v, bool):
            if v:
                cmd += ' ' + k
        else:
            cmd += ' ' + k + '=' + str(v)
    successful = False
    while not successful:
        if get_port:
            self._port = self._get_free_port()
        cmd1 = cmd + ' --port=' + str(self._port)
        self._server_proc = await asyncio.create_subprocess_exec(*cmd1.split(), stdout=subprocess.PIPE,
                                                                 stderr=subprocess.STDOUT)
        while True:
            try:
                line = await asyncio.wait_for(self._server_proc.stdout.readline(), 1)
            except:
                # The call timed out
                self._server_proc.terminate()
                for i in range(len(self._players)):
                    self._send(i, ('_close', []))
                self._players = []
                break
            else:
                if not line:  # EOF
                    self._server_proc.terminate()
                    for i in range(len(self._players)):
                        self._send(i, ('_close', []))
                    self._players = []
                    break
                else:
                    line = str(line)
                    if 'Waiting for player-controlled agent' in line:
                        message = line.split()
                        in_q = Queue()
                        out_q = Queue()
                        Player(
                            feature_set=self._feature_set,
                            name=message[4][:-1],
                            config_dir=message[5].split('=')[1][:-1],
                            server_port=int(message[6].split('=')[1][:-1]),
                            server_addr=message[7].split('=')[1][:-1],
                            team_name=message[8].split('=')[1][:-1],
                            play_goalie=False,
                            in_q=out_q,
                            out_q=in_q,
                        )
                        self._players.append((out_q, in_q))
                        self._send(-1, ('_connect_to_server', []))
                    elif 'Starting game' in line:
                        successful = True
                        break
                    elif 'killall' in line:
                        self._server_proc.terminate()
                        for i in range(len(self._players)):
                            self._send(i, ('_close', []))
                        self._players = []
                        break
                    else:
                        pass
                        # For debugging you can print(line)
                        continue  # While some criterium is satisfied
As you can see this was part of a class where the parameters are stored as a dictionary and self._get_free_port() returns the number of a port that is currently not being used.
Cheers,
Andrea
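The self._get_free_port() helper mentioned above is not shown in this thread. One common way to write such a helper is to bind a socket to port 0 and let the operating system choose; the sketch below does that. The function name and the choice of a UDP socket (rcssserver accepts player connections over UDP) are assumptions, not the original implementation.

```python
import socket


def get_free_port(host="127.0.0.1"):
    """Ask the OS for a currently unused UDP port and return its number."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, 0))  # port 0 means "pick any free port"
        return sock.getsockname()[1]


if __name__ == "__main__":
    print(get_free_port())
```

The port is released as soon as the function returns, so another program could claim it before the server starts. A retry loop like the `while not successful` loop above covers that case.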
Thank you @andreaBassich for the code. Actually I'm trying to run this code: https://github.com/f-leno/AdHoc_AAMAS-17
They create a new thread for every agent and then initialise the connection to HFO in that thread. In your code you terminate the process here:
self._server_proc.terminate()
for i in range(len(self._players)):
    self._send(i, ('_close', []))
self._players = []
break
but I guess the thread should stop when the program terminates, so the problem should not arise. Could you tell me whether there is any issue with this approach?
And if you have ever worked on this code in the past, it would be a great help!
In my code I am doing something similar. I have put the Player class referenced in the code above at the end of this post if you want to have a look. My code is built so that the server can be restarted with different parameters, which is why it is set up that way.
I haven't worked on this repo in the past, so unfortunately I can't give you any tips on that.
class Player:
    def __init__(self, feature_set, name, config_dir, server_port, server_addr, team_name, play_goalie, in_q, out_q):
        self._feature_set = feature_set
        self._name = name
        self._config_dir = config_dir
        self._server_port = server_port
        self._server_addr = server_addr
        self._team_name = team_name
        self._play_goalie = play_goalie
        self._env = hfo.HFOEnvironment()
        self._in_q = in_q
        self._out_q = out_q
        self._done = False
        self._thread = threading.Thread(target=self._execute)
        self._thread.start()
        self._teammate_number = 0
        self._can_kick = False

    def _execute(self):
        while not self._done:
            method, args = self._in_q.get()
            res = self.__getattribute__(method)(*args)
            if res is not None:
                self._out_q.put(res)

    def _connect_to_server(self):
        self._env.connectToServer(
            feature_set=self._feature_set,
            config_dir=self._config_dir,
            server_port=self._server_port,
            server_addr=self._server_addr,
            team_name=self._team_name,
            play_goalie=self._play_goalie,
        )
        self.status = hfo_py.IN_GAME

    def _step(self, action):
        action_type = ACTIONS[action]
        if action_type == hfo_py.SHOOT or action_type == hfo_py.PASS or action_type == hfo_py.DRIBBLE:
            if self._can_kick:
                if action_type == hfo_py.PASS:
                    self._env.act(action_type, self._teammate_number)
                else:
                    self._env.act(action_type)
            else:
                self._env.act(hfo_py.NOOP)
        else:
            self._env.act(action_type)
        self.status = self._env.step()
        return self._getState(), self._get_reward(), self.status

    def _getState(self):
        state = self._env.getState()
        if state[15] != -2:
            self._teammate_number = state[15]
        self._can_kick = state[5] == 1
        return state

    def _reset(self):
        while self.status == hfo_py.IN_GAME:
            self._env.act(hfo_py.NOOP)
            self.status = self._env.step()
        return self._getState()

    def _get_reward(self):
        if self.status == hfo_py.GOAL:
            return 1
        if self.status == hfo_py.CAPTURED_BY_DEFENSE:
            return -1
        if self.status == hfo_py.OUT_OF_BOUNDS:
            return -1
        if self.status == hfo_py.OUT_OF_TIME:
            return 0
        return 0

    def _close(self):
        self._env.act(hfo_py.QUIT)
        self._env.step()
        self._done = True
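The self._send helper that drives Player from the server-starting code is not shown in this thread. As a rough sketch of how the two sides might fit together, the example below pairs a stripped-down worker with a send function that puts (method, args) tuples on a queue. Worker, send and _add are hypothetical stand-ins, and no HFO calls are made.

```python
import threading
from queue import Queue


class Worker:
    """Minimal stand-in for Player: runs commands received on a queue."""

    def __init__(self, in_q, out_q):
        self._in_q = in_q
        self._out_q = out_q
        self._done = False
        self._thread = threading.Thread(target=self._execute, daemon=True)
        self._thread.start()

    def _execute(self):
        while not self._done:
            method, args = self._in_q.get()  # blocks until a command arrives
            res = getattr(self, method)(*args)
            if res is not None:
                self._out_q.put(res)

    def _add(self, a, b):
        return a + b

    def _close(self):
        self._done = True  # returns None, so nothing is put on out_q

    def join(self, timeout=None):
        self._thread.join(timeout)


def send(queues, message, wait_for_reply=False):
    """Hypothetical counterpart of self._send for a single worker."""
    to_worker, from_worker = queues
    to_worker.put(message)
    if wait_for_reply:
        return from_worker.get(timeout=5)
    return None


if __name__ == "__main__":
    queues = (Queue(), Queue())
    worker = Worker(in_q=queues[0], out_q=queues[1])
    print(send(queues, ('_add', [2, 3]), wait_for_reply=True))
    send(queues, ('_close', []))
    worker.join(2)
```

Only methods that return a value produce a reply, so the caller has to know which commands to wait on; _close and _connect_to_server in the Player class above return nothing.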
Thank you @andreaBassich. Your code helped to resolve my issue and I was able to run using threads.
Hi @andreaBassichUoY, I am wrapping HFO as a multi-agent environment (Gym-style). I use multiple threads so that several agents can connect to the server at the same time. However, the server always goes down. Could you tell me how you restart the server, and how to "not have the main process initialise/connect to the server"?
Hi @mhauskn,
I'm using this environment in Python through the hfo_py library, and I need to dynamically change some parameters, such as the number of defenders, while staying in the same session. I tried to kill the server process and open a new server with different parameters; however, I get the following error:
hfo_py/src/strategy.cpp 198: already initialized. ERROR Failed to read team strategy. Init failed
This happens even when I manually reload the whole hfo_py module. Is there a way to modify the parameters of the environment without restarting the server? If I were to restart the server, what would I need to do in order to avoid this exception?
Cheers