AIcrowd / neurips2020-procgen-starter-kit

Starter Kit for NeurIPS 2020 - Procgen Competition on AIcrowd
Apache License 2.0

Permission denied (publickey). #14

Open ozamanan opened 4 years ago

ozamanan commented 4 years ago

When I run the command

git clone git@github.com:AIcrowd/neurips2020-procgen-starter-kit.git

I get this error:

Warning: Permanently added the RSA host key for IP address '140.82.113.3' to the list of known hosts.
git@github.com: Permission denied (publickey).
fatal: Could not read from remote repository.

Please make sure you have the correct access rights and the repository exists.

And when I simply clone the repo and execute the ./run.sh --train command, I get this error:

ray.tune.error.TuneError: Insufficient cluster resources to launch trial: trial requested 7 CPUs, 0.8999999999999999 GPUs but the cluster has only 2 CPUs, 1 GPUs, 1.37 GiB heap, 0.63 GiB objects (1.0 node:192.168.1.168). Pass queue_trials=True in ray.tune.run() or on the command line to queue trials until the cluster scales up or resources become available.

KarolisRam commented 4 years ago

I had both of these issues. I solved the first one by cloning over HTTPS instead of SSH: git clone https://github.com/AIcrowd/neurips2020-procgen-starter-kit.git

I solved the second one by changing line 20 of run.sh to: export RAY_CPUS=8
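To see why the trial asks for 7 CPUs and roughly 0.9 GPUs, here is a minimal sketch of how an RLlib-style config is usually turned into a resource request (driver plus per-worker resources). The key names are standard RLlib config keys, but the values below (6 workers, 0.6 GPU for the trainer, 0.05 GPU per worker) are assumptions picked to match the numbers in the error message, not values read from the starter kit. The `TrialResources` class and `fits` helper are hypothetical, and IMPALA may count extra aggregation workers that this sketch ignores.

```python
from dataclasses import dataclass


@dataclass
class TrialResources:
    """Hypothetical holder for the resource-related keys of an RLlib-style config."""
    num_workers: int
    num_cpus_per_worker: float = 1.0
    num_cpus_for_driver: float = 1.0
    num_gpus: float = 0.0
    num_gpus_per_worker: float = 0.0

    def requested_cpus(self) -> float:
        # One share for the driver (trainer) plus one share per rollout worker.
        return self.num_cpus_for_driver + self.num_workers * self.num_cpus_per_worker

    def requested_gpus(self) -> float:
        # Trainer GPU fraction plus the fraction reserved for each rollout worker.
        return self.num_gpus + self.num_workers * self.num_gpus_per_worker


def fits(trial: TrialResources, cluster_cpus: float, cluster_gpus: float) -> bool:
    """Return True if the trial's request fits in the resources declared to Ray."""
    return (trial.requested_cpus() <= cluster_cpus
            and trial.requested_gpus() <= cluster_gpus)


# Assumed values, chosen only to reproduce "7 CPUs, ~0.9 GPUs" from the error.
assumed_default = TrialResources(num_workers=6, num_gpus=0.6, num_gpus_per_worker=0.05)

if __name__ == "__main__":
    print("requested CPUs:", assumed_default.requested_cpus())
    print("fits 2 CPUs / 1 GPU:", fits(assumed_default, cluster_cpus=2, cluster_gpus=1))
    print("fits 8 CPUs / 1 GPU:", fits(assumed_default, cluster_cpus=8, cluster_gpus=1))
```

Under these assumptions there are two ways out: declare more CPUs to Ray (the RAY_CPUS change above), or lower num_workers in the experiment config so that the request itself shrinks. Declaring more CPUs than the machine physically has lets the trial start but oversubscribes the cores.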

ozamanan commented 4 years ago

A follow-up question: my system has 16 GB of RAM, 2 GB of swap and 6 GB of GPU memory, but that seems to be insufficient when I execute ./run.sh --train. Is it necessary to allocate so much memory, or will it work with a smaller allocation?

jyotishp commented 4 years ago

Hello @ozamanan, as @KarolisRam mentioned, you can change these values to suit the hardware you have available.

You can try playing with these parameters as well to get it to run locally. For the impala-baseline.yaml provided in the starter kit, during the training phase with the default configuration, the trainer takes around 9 GB of GPU memory and the rollout workers take around 900 MB each. The experiment takes around 18 GB of RAM.

You can also try playing with the procgen-starter-example.yaml which requires far fewer resources. Make sure to update the experiment variable accordingly.
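As a rough way to compare a local machine against the impala-baseline.yaml figures quoted above, here is a small sketch. The three constants come from this comment. Treating the 900 MB per rollout worker as GPU memory is an assumption, and the function names are made up for illustration.

```python
# Approximate figures quoted in this thread for impala-baseline.yaml.
TRAINER_GPU_GB = 9.0      # trainer GPU memory
WORKER_GPU_GB = 0.9       # per rollout worker (assumed here to be GPU memory)
EXPERIMENT_RAM_GB = 18.0  # system RAM used by the whole experiment


def estimate_gpu_gb(num_workers: int) -> float:
    """Estimate GPU memory as trainer plus per-worker usage."""
    return TRAINER_GPU_GB + num_workers * WORKER_GPU_GB


def shortfalls(ram_gb: float, gpu_gb: float, num_workers: int) -> list:
    """Return human-readable reasons why the default config would not fit."""
    problems = []
    if ram_gb < EXPERIMENT_RAM_GB:
        problems.append(f"RAM: have {ram_gb} GB, need about {EXPERIMENT_RAM_GB} GB")
    needed_gpu = estimate_gpu_gb(num_workers)
    if gpu_gb < needed_gpu:
        problems.append(f"GPU memory: have {gpu_gb} GB, need about {needed_gpu:.1f} GB")
    return problems


if __name__ == "__main__":
    # The machine described earlier in the thread: 16 GB RAM, 6 GB GPU memory.
    # num_workers=0 counts only the trainer, the most optimistic case.
    for line in shortfalls(ram_gb=16, gpu_gb=6, num_workers=0):
        print(line)
```

By these estimates, a machine with 16 GB of RAM and 6 GB of GPU memory falls short on both counts even before rollout workers are added, which is consistent with trying the lighter procgen-starter-example.yaml instead.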