Open Holt59 opened 5 years ago
Hi @Holt59
One possibility is that for some reason the port is taken for that environment. We start with port 5005 (worker_id=0) and increment from there. I would suggest trying different worker ids.
If that doesn't seem to be the issue, another thing to try would be to add a wait time between launching each environment. We've gotten reports that errors like this can occur when too many Unity processes are launched concurrently.
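A minimal sketch of such a staggered launch, in case it helps. `make_env` is a hypothetical factory standing in for however you construct the environment for a given worker id (it is not an ML-Agents function), and the one-second default is only a starting point to tune:

```python
import time
from typing import Callable, List, TypeVar

Env = TypeVar("Env")


def launch_staggered(
    make_env: Callable[[int], Env],  # hypothetical factory: worker_id -> environment
    count: int,
    delay_s: float = 1.0,            # assumed pause between launches, tune as needed
    first_worker_id: int = 0,
) -> List[Env]:
    """Create `count` environments one at a time, sleeping between launches."""
    envs: List[Env] = []
    for offset in range(count):
        worker_id = first_worker_id + offset
        envs.append(make_env(worker_id))
        # No need to wait after the last environment has been created.
        if offset < count - 1:
            time.sleep(delay_s)
    return envs
```

Each environment is created only after the previous constructor has returned, so the Unity processes never start at the same instant.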
@Holt59 you may be running out of GPU memory. I've only been able to run 2x16 locally (16 per GPU; one is a 1080 with 8 GB, the other a 1060 with 6 GB). In the large-scale curiosity paper they stated they were only able to get 40 Unity environments running (I can't remember if it was on 4 or 8 GPUs).
Also, I use a one-second delay between launching each Unity instance.
@awjuliani I've already checked the port. I'll try adding a delay between launches.
@Sohojoe I have a 12 GB K80 and I am only starting environments, with no extra algorithms. And as I said, the GPU memory consumption (nvidia-smi) is nowhere near its limit. I'll try the delay between launches.
@Holt59 - did you get around this? I found that some ports are in use on my PC, so I added a hardcoded hack to skip them.
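A rough sketch of that kind of port skipping, using only the Python standard library. It assumes the mapping mentioned earlier in the thread (port 5005 for worker_id=0, incrementing from there); `port_is_free` and `free_worker_ids` are illustrative helper names, not part of ML-Agents:

```python
import socket
from typing import List

BASE_PORT = 5005  # port for worker_id=0, as stated earlier in this thread


def port_is_free(port: int, host: str = "localhost") -> bool:
    """Return True if a TCP socket can currently be bound to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False
        return True


def free_worker_ids(count: int, start: int = 0, limit: int = 1000) -> List[int]:
    """Collect up to `count` worker ids whose port (BASE_PORT + id) is free.

    Scans at most `limit` candidate ids beginning at `start`.
    """
    ids: List[int] = []
    for worker_id in range(start, start + limit):
        if port_is_free(BASE_PORT + worker_id):
            ids.append(worker_id)
            if len(ids) == count:
                break
    return ids
```

The probe is only a snapshot: another process can still grab the port between the check and the actual launch, so a timeout on one worker id may still warrant a retry with the next one.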
@Sohojoe, I did not solve this, but I did not look into it much because I ran into other issues. I checked the ports on my computer and nothing was running on them, so I don't think that was the issue.
I tried to launch 100 environments and got a UnityTimeoutException when creating the 66th one. I checked multiple times and the exception always occurs on the 66th instantiation. I am using gcloud with a K80 GPU and the memory usage is less than the available memory.