donamin opened this issue 7 years ago
I changed the `update_every` value from 25 to 30 (so that it is divisible by `num_agents`) to resolve this warning:

`Number of agents should divide episodes per update.`

But it still doesn't seem to be working. The weird thing is that sometimes when I run the code, I get the following exception:
```
Traceback (most recent call last):
  File "E:/agents/agents/scripts/train.py", line 165, in <module>
    tf.app.run()
  File "C:\Python\Python35\lib\site-packages\tensorflow\python\platform\app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "E:/agents/agents/scripts/train.py", line 147, in main
    for score in train(config, FLAGS.env_processes):
  File "E:/agents/agents/scripts/train.py", line 113, in train
    config.num_agents, env_processes)
  File "E:\agents\agents\scripts\utility.py", line 72, in define_batch_env
    for _ in range(num_agents)]
  File "E:\agents\agents\scripts\utility.py", line 72, in <listcomp>
    for _ in range(num_agents)]
  File "E:\agents\agents\tools\wrappers.py", line 333, in __init__
    self._process.start()
  File "C:\Python\Python35\lib\multiprocessing\process.py", line 105, in start
    self._popen = self._Popen(self)
  File "C:\Python\Python35\lib\multiprocessing\context.py", line 212, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Python\Python35\lib\multiprocessing\context.py", line 313, in _Popen
    return Popen(process_obj)
  File "C:\Python\Python35\lib\multiprocessing\popen_spawn_win32.py", line 66, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\Python\Python35\lib\multiprocessing\reduction.py", line 59, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'train.<locals>.<lambda>'
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "E:\agents\agents\tools\wrappers.py", line 405, in close
    self._process.join()
  File "C:\Python\Python35\lib\multiprocessing\process.py", line 120, in join
    assert self._popen is not None, 'can only join a started process'
AssertionError: can only join a started process
```
Update: when I change `env_processes` to False, it seems to work! But I guess that disables all the parallelism this framework provides, right?
It can be normal for TensorBoard not to show anything for a while. The frequency for writing logs is defined inside `_define_loop()` in train.py. It is set to twice per epoch, where one training epoch is `config.update_every * config.max_length` steps and one evaluation epoch is `config.eval_episodes * config.max_length` steps. It could be that either your environment is very slow or that an epoch consists of a large number of steps for you.
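For a rough sense of scale, a back-of-the-envelope calculation looks like this (the `max_length` and step-time values below are made-up placeholders, not taken from your run):

```python
# Summaries are written twice per training epoch.
update_every = 30        # episodes collected per policy update
max_length = 200         # assumed maximum episode length
seconds_per_step = 0.01  # assumed environment step time

steps_per_train_epoch = update_every * max_length     # 6000 steps
steps_between_summaries = steps_per_train_epoch // 2  # 3000 steps
print('first summary after roughly %.0f seconds'
      % (steps_between_summaries * seconds_per_step))  # ~30 seconds
```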
What environment are you using and how long are episodes typically? Can you post your full config?
I worked on that, and it seems there's some other problem with the code. It now shows the same `Can't pickle local object 'train.<locals>.<lambda>'` exception as the traceback above.
If I change `env_processes` to False, it works! Do you know what the problem is?
Please wrap code blocks in 3 backticks. Your configuration must be picklable, and it looks like yours is not. Try to define it without using lambdas. As alternatives, define external functions, nested functions, or use `functools.partial()`. I need to see your configuration to help further.
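For example, one picklable alternative to a lambda looks roughly like this (just a sketch; `make_environment` stands in for whatever builds your environment, e.g. the `_create_environment(config)` call in train.py):

```python
import functools


def make_environment(env_name):
  # Stand-in for train.py's `_create_environment(config)`; any module-level
  # function that builds the environment works here.
  import gym
  return gym.make(env_name)


# A lambda such as `lambda: make_environment('Pendulum-v0')` cannot be pickled,
# which the spawn start method on Windows requires. functools.partial() over a
# module-level function is pickled by reference instead:
constructor = functools.partial(make_environment, 'Pendulum-v0')

# It behaves like the zero-argument lambda it replaces:
env = constructor()
```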
OK, I have an update:
In train.py, I changed this line:
```python
batch_env = utility.define_batch_env(lambda: _create_environment(config), config.num_agents, env_processes)
```
into this:
```python
batch_env = utility.define_batch_env(_create_environment(config), config.num_agents, env_processes)
```
Now it doesn't give me that previous error, but it seems to freeze after showing this log:
```
INFO:tensorflow:Start a new run and write summaries and checkpoints to E:\model\20170922-165119-pendulum.
[2017-09-22 16:51:19,149] Making new env: Pendulum-v0
```
The CPU load for my Python process is 0%, so it doesn't seem to be doing anything. Any ideas?
This is my config:
```python
def default():
  """Default configuration for PPO."""
  # General
  algorithm = ppo.PPOAlgorithm
  num_agents = 10
  eval_episodes = 25
  use_gpu = False
  # Network
  network = networks.ForwardGaussianPolicy
  weight_summaries = dict(all=r'.*', policy=r'.*/policy/.*', value=r'.*/value/.*')
  policy_layers = 200, 100
  value_layers = 200, 100
  init_mean_factor = 0.05
  init_logstd = -1
  # Optimization
  update_every = 30
  policy_optimizer = 'AdamOptimizer'
  value_optimizer = 'AdamOptimizer'
  update_epochs_policy = 50
  update_epochs_value = 50
  policy_lr = 1e-4
  value_lr = 3e-4
  # Losses
  discount = 0.985
  kl_target = 1e-2
  kl_cutoff_factor = 2
  kl_cutoff_coef = 1000
  kl_init_penalty = 1
  return locals()
```
Where is the `env` defined in your config? You should not create the environments in the main process, as you did by removing the lambda.
I thought that we give `env` as one of the main arguments in the command prompt.
So how should I create the environments? Do you mean I should change the default code structure to make BatchPPO work?
No, I meant you should undo the change you made to the batch env line. You define environments in your config by setting `env = ...` to either the name of a registered Gym environment or to a function that returns an env object.
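For illustration, a task config along these lines could look roughly like the sketch below (the `pendulum` name, the commented-out `my_envs.make_custom_env`, and the `max_length` value are placeholders, and the shared defaults are omitted):

```python
def pendulum():
  """Hypothetical task config; only the environment-related keys are shown."""
  # Either the id of a registered Gym environment ...
  env = 'Pendulum-v0'
  # ... or a module-level function that returns an environment object, e.g.:
  # env = my_envs.make_custom_env
  max_length = 200  # assumed maximum episode length for this example
  return locals()
```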
Oh OK, I found out what I did wrong by removing the lambda keyword. But how can I solve this using external or nested functions? I did a lot of searching but couldn't figure it out, since I'm kind of new to Python. Can you help me with this? How is it that it works on your computer and not on mine? Not being able to pickle lambda functions seems to be a Python limitation, and I already tried Python 3.5 and 3.6.
I've seen it working on many people's computers :)
Please check if YAML is installed:
```
python3 -c "import ruamel.yaml; print('success')"
```
And check if the Pendulum environment works:
```
python3 -c "import gym; e=gym.make('Pendulum-v0'); e.reset(); e.render(); input('success')"
```
If both work, please start from a fresh clone of this repository and report your error message again.
Thanks for your reply.
I tried both tests with success.
I cloned the repository again and the code still doesn't work. It no longer shows that lambda error, but it hangs when it reaches this line of code in wrappers.py:
```python
self._process.start()
```
When I debug, stepping into the start function eventually leads me to this line in context.py (the code hangs when it reaches it):
```python
from .popen_spawn_win32 import Popen
```
BTW, I'm using Windows 10. Maybe it has something to do with the OS?
Yeah, that might be the problem. Process handling is quite different between Windows and Linux/Mac, and we mainly tested on the latter. I'm afraid I can't be of much help since I don't use Windows. Do you have an idea how to debug this? I'd be happy to test and merge a fix if you come up with one.
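For context, the difference mostly comes down to the start method: on Windows, `multiprocessing` spawns a fresh interpreter and pickles the process target and its arguments, so lambdas and closures fail. A minimal standalone illustration (not code from this repository):

```python
import multiprocessing


def worker(value):
  # A module-level function is pickled by reference, so it survives the trip
  # to the fresh child interpreter that 'spawn' starts on Windows.
  print('child received', value)


if __name__ == '__main__':
  process = multiprocessing.Process(target=worker, args=(42,))
  process.start()
  process.join()
  # Swapping `worker` for `lambda value: print(value)` makes start() fail on
  # Windows with the same kind of "Can't pickle" error as above, because the
  # target and its arguments are sent to the child by pickling.
```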
OK, thanks for your reply. I have no idea right now, but I will work on it because it's important for me to make this work on Windows. I'll let you know if it's solved. Thanks :)
@donamin Were you able to narrow down this issue?
@danijar No, I couldn't solve it, so I had to switch to Linux. Sorry.
Thanks for getting back. I'll keep this issue open for now. We might support Windows in the future since as far as I can see the threading is the only platform-specific bit. But unfortunately, there are no concrete plans for this at the moment.
It seems you cannot use the `_worker` class method as the target of `multiprocessing.Process` on Windows. If you use a global `def globalworker(constructor, conn):` it does not hang, but then it cannot use `getattr`. Is there a way to rewrite `_worker` as a `globalworker`?
```python
self._process = multiprocessing.Process(
    target=globalworker, args=(constructor, conn))
```
@erwincoumans Yes, this seems trivial, since `self._worker()` does not access any object state. You'd just have to replace the occurrences of `self` with `ExternalProcess`. I'd be happy to accept a patch if this indeed fixes the behavior on Windows; I don't have a way to test on Windows myself.
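For reference, a rough sketch of what such a module-level worker could look like (simplified; the message names and protocol here are illustrative, not the actual wrappers.py implementation):

```python
def _external_worker(constructor, conn):
  # Hypothetical module-level worker; it only needs the environment
  # constructor and the pipe connection, no object state.
  env = constructor()  # build the environment inside the child process
  try:
    while True:
      message, payload = conn.recv()
      if message == 'access':    # attribute lookup, e.g. action_space
        conn.send(getattr(env, payload))
      elif message == 'call':    # method call, e.g. step(action)
        name, args, kwargs = payload
        conn.send(getattr(env, name)(*args, **kwargs))
      elif message == 'close':
        break
  finally:
    conn.close()

# ExternalProcess.__init__ would then create the process as suggested above:
#   self._process = multiprocessing.Process(
#       target=_external_worker, args=(constructor, conn))
```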
Hi,
I started the training a few minutes ago and this is what I got in the command prompt:
It's been like this for about 10 minutes and TensorBoard doesn't show anything. In the log directory there is only one file, 'config.yaml'. Is that OK? It would be nice to see whether the agent is making progress or is hung.
Thanks, Amin