allenai / allenact

An open source framework for research in Embodied-AI from AI2.
https://www.allenact.org

Cannot experiment because my internet is slow (timeout) #373

Closed nbqu closed 9 months ago

nbqu commented 1 year ago

Problem / Question

I am trying to train on the rearrangement task, but I cannot even start training because the download of the controller build (thor-Linux64-xxx.zip) does not finish before the 300-second timeout is reached. Can I complete the download before training? I have not confirmed that the download is the cause, but I am fairly sure of it because my download speed is only around 500 KiB/s. Perhaps this could be improved so that the timeout clock does not start until the download is done.
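
As a rough check on the numbers above, the sketch below computes how much data fits in the 300-second window at about 500 KiB/s. The size of the build archive is not stated in this thread, so it is taken as a parameter rather than assumed; `max_download_mib` and `fits_in_timeout` are hypothetical helper names.

```python
def max_download_mib(speed_kib_per_s: float, timeout_s: float) -> float:
    """Upper bound on data (in MiB) transferred before the timeout expires."""
    return speed_kib_per_s * timeout_s / 1024


def fits_in_timeout(archive_mib: float, speed_kib_per_s: float, timeout_s: float) -> bool:
    """True if an archive of the given size can finish downloading in time."""
    return archive_mib <= max_download_mib(speed_kib_per_s, timeout_s)


# Figures from the report: ~500 KiB/s and the default 300-second timeout.
budget = max_download_mib(500, 300)
print(f"{budget:.1f} MiB")  # 146.5 MiB
```

At that speed, any archive larger than roughly 146 MiB cannot finish within the default timeout.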

Additional context

I'm running the experiment on a headless server inside a Docker environment, following the ai2thor-rearrangement guide. In the Python interpreter, running from ai2thor.controller import Controller; c = Controller() loads a controller without problems.

jordis-ai2 commented 1 year ago

Hi @nbqu,

Apologies for the delayed response. If I understand correctly, you should be able to run a Python script similar to

from ai2thor.platform import CloudRendering
from ai2thor.controller import Controller

c = Controller(commit_id="a9ccb07faf771377c9ff1615bfe7e0ad01968663", gpu_device=0, platform=CloudRendering)
c.stop()

ahead of training. Let us know if this unblocks you!

Lucaweihs commented 1 year ago

Hi @nbqu,

In case what @jordis-ai2 wrote doesn't work for you, you can also change the server_start_timeout parameter of the Controller from its default value of 300 seconds to something much larger:

from ai2thor.platform import CloudRendering
from ai2thor.controller import Controller

c = Controller(server_start_timeout=30000, commit_id="a9ccb07faf771377c9ff1615bfe7e0ad01968663", gpu_device=0, platform=CloudRendering)
c.stop()
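
The two suggestions in this thread (start a controller ahead of training, and raise the start timeout) can be combined into a small warm-up script. This is only a sketch: `prefetch_build` and `main` are hypothetical names, the 30000-second timeout is the example value from this comment, and the controller class is passed in as an argument so the helper can be exercised without ai2thor installed.

```python
from typing import Any, Callable


def prefetch_build(controller_cls: Callable[..., Any], **controller_kwargs: Any) -> None:
    """Start a controller once so the build gets downloaded, then shut it down."""
    controller = controller_cls(**controller_kwargs)
    # Nothing else is needed from this controller, so stop it straight away.
    controller.stop()


def main() -> None:
    # Imported here so prefetch_build stays usable without ai2thor present.
    from ai2thor.controller import Controller
    from ai2thor.platform import CloudRendering

    prefetch_build(
        Controller,
        server_start_timeout=30000,  # example value from this thread
        commit_id="a9ccb07faf771377c9ff1615bfe7e0ad01968663",
        gpu_device=0,
        platform=CloudRendering,
    )
```

Calling `main()` once before launching training would then play the same role as the snippet above.
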

nbqu commented 1 year ago

Thank you for your response. I found that the file is downloaded once to ~/.ai2thor, so I can reuse it once the download has completed. I ran python -c "from ai2thor.controller import Controller; Controller(commit_id='...')" before the experiment, and it helped. Closing this issue. Thank you all for your help!
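
To check whether a build is already cached before launching an experiment, something like the following could be used. It is a sketch under assumptions: only the `~/.ai2thor` location comes from this thread and the layout beneath it is not described here, so the helper simply searches the whole tree for entries whose name contains the commit id. `find_cached_builds` is a hypothetical name.

```python
from pathlib import Path
from typing import List


def find_cached_builds(commit_id: str, cache_dir: Path = Path.home() / ".ai2thor") -> List[Path]:
    """Return paths under cache_dir whose name contains commit_id."""
    if not cache_dir.is_dir():
        return []
    return sorted(p for p in cache_dir.rglob("*") if commit_id in p.name)


matches = find_cached_builds("a9ccb07faf771377c9ff1615bfe7e0ad01968663")
print("cached" if matches else "not cached yet")
```

A match only shows that something for that commit exists on disk; it does not prove the download completed.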

nbqu commented 1 year ago

Hello, I am reopening this issue because I found an alternative way to resolve it. I manually downloaded the file via aws-cli with the command aws s3 --no-sign-request s3://ai2-thor-public/builds/(commit id), and found that this is much faster than downloading via the downloader script in this project. This may have happened only to me, but offering this as another option would be worth considering for anyone having the same trouble.
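
For anyone scripting the manual download, the sketch below builds an AWS CLI invocation from Python. Note the assumptions: the command quoted above does not include an `s3` subcommand, and `cp` is used here as a guess at what was run; the exact object name under `s3://ai2-thor-public/builds/` is not given in this thread, so the full URI is left to the caller; `build_download_command` and `download` are hypothetical helpers.

```python
import subprocess
from typing import List


def build_download_command(s3_uri: str, destination: str) -> List[str]:
    """Build an unsigned `aws s3 cp` command for a public object."""
    if not s3_uri.startswith("s3://"):
        raise ValueError(f"expected an s3:// URI, got {s3_uri!r}")
    # --no-sign-request skips credential signing, as in the command above.
    return ["aws", "s3", "cp", "--no-sign-request", s3_uri, destination]


def download(s3_uri: str, destination: str) -> None:
    """Run the command; raises CalledProcessError if the AWS CLI fails."""
    subprocess.run(build_download_command(s3_uri, destination), check=True)
```

Where the archive must be placed for ai2thor to pick it up is not covered in this thread.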