eugenevinitsky / sequential_social_dilemma_games

Repo for reproduction of sequential social dilemmas
MIT License

Trouble getting train_baseline to work #180

Open maunhb opened 4 years ago

maunhb commented 4 years ago

Hi, I have managed to install the repo and get the tests working, but train_baseline gives errors when run. I tried updating the ray version, but this caused other problems. At the moment I'm using ray 0.6.1 and have added a symlink to experimental, as this seemed to be required.

Exception in thread ray_listen_error_messages:
Traceback (most recent call last):
  File "/home/charlotte/anaconda3/envs/causal/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/home/charlotte/anaconda3/envs/causal/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/home/charlotte/anaconda3/envs/causal/lib/python3.6/site-packages/ray/worker.py", line 1818, in listen_error_messages_raylet
    error_messages = global_state.error_messages(worker.task_driver_id)
  File "/home/charlotte/anaconda3/envs/causal/lib/python3.6/site-packages/ray/experimental/state.py", line 897, in error_messages
    assert isinstance(job_id, ray.DriverID)
AttributeError: module 'ray' has no attribute 'DriverID'

Commencing experiment cleanup_A3C
Did not find checkpoint file in /home/charlotte/ray_results/cleanup_A3C. Starting a new experiment.
Traceback (most recent call last):
  File "/home/charlotte/anaconda3/envs/causal/lib/python3.6/site-packages/ray/tune/ray_trial_executor.py", line 295, in _update_avail_resources
    resources = ray.global_state.cluster_resources()
  File "/home/charlotte/anaconda3/envs/causal/lib/python3.6/site-packages/ray/experimental/state.py", line 767, in cluster_resources
    clients = self.client_table()
  File "/home/charlotte/anaconda3/envs/causal/lib/python3.6/site-packages/ray/experimental/state.py", line 404, in client_table
    return parse_client_table(self.redis_client)
  File "/home/charlotte/anaconda3/envs/causal/lib/python3.6/site-packages/ray/experimental/state.py", line 29, in parse_client_table
    NIL_CLIENT_ID = ray.ObjectID.nil().binary()
AttributeError: type object 'common.ObjectID' has no attribute 'nil'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train_baseline.py", line 183, in <module>
    tf.app.run(main)
  File "/home/charlotte/anaconda3/envs/causal/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "train_baseline.py", line 177, in main
    "config": config,
  File "/home/charlotte/anaconda3/envs/causal/lib/python3.6/site-packages/ray/tune/tune.py", line 164, in run_experiments
    trial_executor=trial_executor)
  File "/home/charlotte/anaconda3/envs/causal/lib/python3.6/site-packages/ray/tune/trial_runner.py", line 95, in __init__
    RayTrialExecutor(queue_trials=queue_trials)
  File "/home/charlotte/anaconda3/envs/causal/lib/python3.6/site-packages/ray/tune/ray_trial_executor.py", line 44, in __init__
    self._update_avail_resources()
  File "/home/charlotte/anaconda3/envs/causal/lib/python3.6/site-packages/ray/tune/ray_trial_executor.py", line 301, in _update_avail_resources
    None, None, None)
TypeError: check_and_update_resources() takes 1 positional argument but 3 were given

When I updated to the current ray version, the function get_global_worker wasn't recognized, since it only exists in worker.py in Natasha's version of ray. I don't understand symlinks well enough to create one that points to this file rather than to a folder. Do you know what the mistake might be? Thanks

Charlotte
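
On the symlink question: a symbolic link can point to a single file just as well as to a folder. Below is a minimal sketch of the mechanics only, assuming a POSIX system. The file names (worker_fork.py, worker.py) and the helpers link_file and demo are hypothetical and work in a temporary directory, not in the real ray install. Whether linking one file across ray versions would actually resolve the missing get_global_worker is a separate question that this sketch does not answer.

```python
import os
import tempfile


def link_file(target_file, link_path):
    """Create a symlink at link_path that points to a single file.

    Hypothetical helper for illustration; raises if the target is not a file.
    """
    if not os.path.isfile(target_file):
        raise FileNotFoundError(target_file)
    # Use an absolute target so the link does not depend on the current directory.
    os.symlink(os.path.abspath(target_file), link_path)
    return link_path


def demo():
    """Link a stand-in file and report what kind of path the link is."""
    with tempfile.TemporaryDirectory() as tmp:
        # Stand-in for the file that should be linked (hypothetical name).
        target = os.path.join(tmp, "worker_fork.py")
        with open(target, "w") as f:
            f.write("def get_global_worker():\n    return 'worker'\n")

        # The link lives next to it under the name other code would import.
        link = os.path.join(tmp, "worker.py")
        link_file(target, link)

        # The link is a symlink, resolves to a regular file, and is not a folder.
        return os.path.islink(link), os.path.isfile(link), os.path.isdir(link)


if __name__ == "__main__":
    print(demo())
```

From a shell, the equivalent is ln -s followed by the path of the target file and then the path of the link; the same command links a folder when the target is a folder.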

internetcoffeephone commented 4 years ago

Currently, @eugenevinitsky is in the process of merging in my fork, which is several ray versions ahead. Until it's merged in, you can use that instead. It contains many bugfixes, and should work out of the box.
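
To confirm that the tracebacks above come from a mismatch between the installed ray and the ray API the code expects, a quick attribute check along these lines may help. This is a rough sketch, not part of the repo: missing_attributes is a hypothetical helper, and the attribute names passed to it are simply the ones mentioned in this issue.

```python
import importlib


def missing_attributes(module_name, dotted_names):
    """Return the dotted attribute paths that cannot be resolved on a module.

    Hypothetical helper for illustration only.
    """
    module = importlib.import_module(module_name)
    missing = []
    for dotted in dotted_names:
        obj = module
        # Walk the path one attribute at a time, e.g. "ObjectID.nil".
        for part in dotted.split("."):
            if not hasattr(obj, part):
                missing.append(dotted)
                break
            obj = getattr(obj, part)
    return missing


if __name__ == "__main__":
    try:
        import ray
    except ImportError:
        print("ray is not installed in this environment")
    else:
        print("ray version:", ray.__version__)
        # Names taken from the error messages and question in this issue.
        names = ["DriverID", "ObjectID.nil", "worker.get_global_worker"]
        print("missing:", missing_attributes("ray", names))
```

Any name reported as missing suggests that the installed ray does not match the version the calling code was written against.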