Closed vkotronis closed 5 years ago
I have located the issue with parallelism at this line: https://github.com/openai/spinningup/blob/master/spinup/utils/test_policy.py#L20 Something seems to go wrong during the restoration process (e.g., some kind of locking): the session tries to load a new agent but, for some reason, sees the TF variables from a previously loaded session. See this error:
Process Process-1:
Traceback (most recent call last):
File "/home/vkotronis/Desktop/git_projects/DRL/venv_new/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1356, in _do_call
return fn(*args)
File "/home/vkotronis/Desktop/git_projects/DRL/venv_new/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1341, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/home/vkotronis/Desktop/git_projects/DRL/venv_new/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1429, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [128,36] rhs shape= [128,1296]
[[{{node save_2/Assign_19}}]]
The non-matching tensors belong to different agents that should be loaded in parallel. I will check in detail what the restoration function does exactly.
Following this a bit further, it seems that something is not fork-safe on this line: https://github.com/openai/spinningup/blob/master/spinup/utils/logx.py#L57
I am using separate processes and sessions, of course, but maybe something is tied to a single session at a time.
If you have any ideas in the meantime, please ping here.
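As an aside, this fork-sensitivity can be reproduced without TensorFlow at all. A minimal stdlib sketch (all names here are illustrative, not from spinup): module-level state mutated by the parent is inherited by a fork()-started child, which is how one agent's already-built graph can leak into another agent's worker — and on Linux, `concurrent.futures.ProcessPoolExecutor` starts its workers with fork by default in current CPython.

```python
import multiprocessing as mp

# Stand-in for process-global framework state, e.g. TensorFlow's default
# graph: a fork()-started child inherits whatever the parent accumulated.
STATE = {"graph": "fresh"}

def _report(queue):
    # The child reports what it sees in the module-level STATE.
    queue.put(STATE["graph"])

def graph_seen_by_forked_child():
    ctx = mp.get_context("fork")  # the start method used by fork-based pools
    queue = ctx.Queue()
    proc = ctx.Process(target=_report, args=(queue,))
    proc.start()
    seen = queue.get()
    proc.join()
    return seen
```

If the parent sets `STATE["graph"] = "agent-A-variables"` before starting the child, the forked child sees that value rather than a fresh one — the analogue of a worker finding another agent's TF variables already in place.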
Solved this by fully parallelizing the entire testing script using subprocess instead of concurrent.futures. Closing the issue.
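For reference, the general shape of that workaround (the command strings below are placeholders, not taken from the thread): launch each test in a brand-new interpreter via `subprocess`, so nothing TensorFlow-related is inherited across agents.

```python
import subprocess
import sys

def run_in_fresh_interpreters(commands):
    """Run each Python command in its own interpreter and wait for all.

    Each child is a freshly started process rather than a fork of this one,
    so every agent gets its own TensorFlow runtime, graph, and session.
    Returns the list of exit codes.
    """
    procs = [subprocess.Popen([sys.executable, "-c", cmd]) for cmd in commands]
    return [p.wait() for p in procs]

# Hypothetical usage: one single-agent test invocation per saved snapshot.
# run_in_fresh_interpreters([
#     "import test_one_agent; test_one_agent.main('out/agent_a', itr=10)",
#     "import test_one_agent; test_one_agent.main('out/agent_b', itr=10)",
# ])
```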
Hello, I have a custom environment on which I have trained DRL agents using the PPO algorithm. I have saved snapshots of several agents (i.e., trained policies) at various epochs, and now I want to test these agents to find out which performs best in the wild. Note that I have trained multiple agents both in space (e.g., same environment but with different tuning parameters) and in time (training epochs). I am trying to test them concurrently in Python, but there seems to be some kind of issue when running the loaded policies in parallel; e.g., with 2 agents and 2 saved epochs per agent, issues appear after the first epoch is tested (see also the code later in this comment).
The exact issue is:
Could it be that some kind of TensorFlow lock on the session prevents more than one testing session from being used at the same time? I have studied the load- and run-policy functions from spinup and no such issue seems to be present in the code, but maybe I am missing something.
My code (with the non-relevant parts removed) is the following:
Could you help with this, please? If possible, simply try to test two different agents in parallel (e.g., using distinct processes) with code like this. Note that I am not doing anything fancy with TensorFlow in the backend; I am simply using the spinup utils and APIs. (By the way, many thanks for this repository and the algorithms it offers!)