cathywu / rllab

rllab is a framework for developing and evaluating reinforcement learning algorithms, fully compatible with OpenAI Gym.

rllab + EC2 setup #2

Closed: cathywu closed this issue 7 years ago

cathywu commented 7 years ago

Issue for tracking rllab + EC2 setup.

Testing the following:

python3 examples/cluster_demo.py

Issue: many nested .mujoco directories (EDIT: RESOLVED)

Reason: my mujoco setup is symlinked:

~/.mujoco -> /Users/cathywu/Dropbox/DotFiles/.mujoco
mjpro131 -> /Users/cathywu/Dropbox/mjpro131

Issue: Value () for parameter groupId is invalid. The value cannot be empty

************************************************************
{'DryRun': False,
 'InstanceCount': 1,
 'LaunchSpecification': {'EbsOptimized': True,
                         'IamInstanceProfile': {'Name': 'rllab'},
                         'ImageId': 'ami-67c5d00d',
                         'InstanceType': 'c4.2xlarge',
                         'KeyName': 'research_virginia',
                         'NetworkInterfaces': [],
                         'SecurityGroupIds': [],
                         'SecurityGroups': ['rllab'],
...
Traceback (most recent call last):
  File "examples/cluster_demo.py", line 50, in <module>
    variant=dict(step_size=step_size, seed=seed)
  File "/Users/cathywu/Dropbox/PhD/DeepRL-Traffic/rllabcathywu/rllab/misc/instrument.py", line 546, in run_experiment_lite
    periodic_sync_interval=periodic_sync_interval)
  File "/Users/cathywu/Dropbox/PhD/DeepRL-Traffic/rllabcathywu/rllab/misc/instrument.py", line 1003, in launch_ec2
    response = ec2.request_spot_instances(**spot_args)
  File "/Users/cathywu/anaconda/envs/rllabcathywu/lib/python3.5/site-packages/botocore/client.py", line 253, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/Users/cathywu/anaconda/envs/rllabcathywu/lib/python3.5/site-packages/botocore/client.py", line 544, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (InvalidParameterValue) when calling the RequestSpotInstances operation: Value () for parameter groupId is invalid. The value cannot be empty
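
A pre-flight check on the request arguments can surface this kind of problem before the call to `request_spot_instances`. The sketch below is only illustrative: the dump above is truncated, so which field carried the empty value is not visible here, and `find_empty_group_ids` is a hypothetical helper, not part of rllab or boto3. It assumes the empty ID sits either in `SecurityGroupIds` or in the `Groups` list of a network interface.

```python
def find_empty_group_ids(spot_args):
    """Return the locations in the launch spec where a security group ID is empty.

    Hypothetical helper: checks only SecurityGroupIds and
    NetworkInterfaces[*].Groups inside LaunchSpecification.
    """
    spec = spot_args.get("LaunchSpecification", {})
    problems = []
    # Top-level list of security group IDs.
    for i, group_id in enumerate(spec.get("SecurityGroupIds", [])):
        if not group_id:
            problems.append("SecurityGroupIds[%d]" % i)
    # Security group IDs attached to each network interface.
    for n, iface in enumerate(spec.get("NetworkInterfaces", [])):
        for i, group_id in enumerate(iface.get("Groups", [])):
            if not group_id:
                problems.append("NetworkInterfaces[%d].Groups[%d]" % (n, i))
    return problems
```

Calling `find_empty_group_ids(spot_args)` just before the request and raising if the result is non-empty would give a local error that names the offending field, instead of the remote `InvalidParameterValue`.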

cathywu commented 7 years ago

Resolved the symlink issue by deleting a stray nested symlink:

rm ~/.mujoco/.mujoco
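
For anyone hitting the same nesting, here is a minimal sketch for spotting such a link before deleting it. It assumes the nested link pointed back at the directory that contains it, which the thread does not confirm. `find_self_links` is a made-up name; only standard-library calls are used.

```python
import os


def find_self_links(directory):
    """Return names of symlinks inside `directory` that resolve to `directory` itself."""
    root = os.path.realpath(os.path.expanduser(directory))
    if not os.path.isdir(root):
        return []
    hits = []
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name)
        # A link that points back at its own parent produces endless nesting,
        # e.g. ~/.mujoco/.mujoco/.mujoco/...
        if os.path.islink(path) and os.path.realpath(path) == root:
            hits.append(name)
    return hits
```

With the setup described above, `find_self_links("~/.mujoco")` would be expected to list `.mujoco` if the nested link indeed resolved to the same Dropbox directory.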

cathywu commented 7 years ago

Repeating step 5, now that the symlink issue is resolved (though it seems unrelated):

(rllabcathywu) cathywu:~/Dropbox/PhD/DeepRL-Traffic/rllabcathywu$ python3 scripts/setup_ec2_for_rllab.py
Creating S3 bucket at s3://cathywu
S3 bucket created
There is an existing role named rllab. Proceed to delete everything rllab-related and recreate? [y/N] y
Listing instance profiles...
Removing role rllab from instance profile rllab
Deleting instance profile rllab
Deleting inline policy CreateTags
Deleting inline policy TerminateInstances
Detaching policy arn:aws:iam::aws:policy/AmazonS3FullAccess
Detaching policy arn:aws:iam::aws:policy/ResourceGroupsandTagEditorFullAccess
Deleting role
Creating role rllab
Attaching policies
Creating inline policies
Creating instance profile rllab
Adding role rllab to instance profile rllab
Setting up region us-east-1
Creating security group in VPC vpc-e9c1c58f
Security group created with id sg-93c688ec
Trying to create key pair with name rllab-us-east-1
Key pair with name rllab-us-east-1 exists. Proceed to delete and recreate? [y/N] y
Deleting existing key pair with name rllab-us-east-1
Recreating key pair with name rllab-us-east-1
Saving keypair file
Identity added: /Users/cathywu/Dropbox/PhD/DeepRL-Traffic/rllabcathywu/private/key_pairs/rllab-us-east-1.pem (/Users/cathywu/Dropbox/PhD/DeepRL-Traffic/rllabcathywu/private/key_pairs/rllab-us-east-1.pem)
Setting up region us-west-1
Creating security group in VPC vpc-94ff49f0
Security group created with id sg-826a10e5
Trying to create key pair with name rllab-us-west-1
Key pair with name rllab-us-west-1 exists. Proceed to delete and recreate? [y/N] y
Deleting existing key pair with name rllab-us-west-1
Recreating key pair with name rllab-us-west-1
Saving keypair file
Identity added: /Users/cathywu/Dropbox/PhD/DeepRL-Traffic/rllabcathywu/private/key_pairs/rllab-us-west-1.pem (/Users/cathywu/Dropbox/PhD/DeepRL-Traffic/rllabcathywu/private/key_pairs/rllab-us-west-1.pem)
Setting up region us-west-2
Creating security group in VPC vpc-df5174b8
Security group created with id sg-1fac7164
Trying to create key pair with name rllab-us-west-2
Key pair with name rllab-us-west-2 exists. Proceed to delete and recreate? [y/N] y
Deleting existing key pair with name rllab-us-west-2
Recreating key pair with name rllab-us-west-2
Saving keypair file
Identity added: /Users/cathywu/Dropbox/PhD/DeepRL-Traffic/rllabcathywu/private/key_pairs/rllab-us-west-2.pem (/Users/cathywu/Dropbox/PhD/DeepRL-Traffic/rllabcathywu/private/key_pairs/rllab-us-west-2.pem)
Writing config file...
rllab/config_personal.py exists. Override? [y/N]

cathywu commented 7 years ago

Same error (running python3 examples/cluster_demo.py > error.txt 2>&1).

Error log: error.txt

cathywu commented 7 years ago

Resolved (thanks @dementrock) by rerunning

python3 scripts/setup_ec2_for_rllab.py

and overwriting rllab/config_personal.py with an automatically generated one.
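
To check that the regenerated config matches what the setup script reported, the region-to-security-group mapping can be pulled out of the script's output and compared with the values in rllab/config_personal.py. This is a minimal sketch, assuming the output format shown earlier in this thread ("Setting up region ..." followed by "Security group created with id ...") and assuming the generated config records these IDs. `parse_security_groups` is a hypothetical helper, not part of rllab.

```python
import re


def parse_security_groups(log_text):
    """Map each region to the security group ID reported in the setup output."""
    groups = {}
    region = None
    for line in log_text.splitlines():
        line = line.strip()
        match = re.match(r"Setting up region (\S+)", line)
        if match:
            # Remember which region the following lines belong to.
            region = match.group(1)
            continue
        match = re.match(r"Security group created with id (\S+)", line)
        if match and region is not None:
            groups[region] = match.group(1)
    return groups
```

Saving the script output to a file and passing its contents to `parse_security_groups` gives a small dictionary that is easy to compare against the config by eye.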