allenai / ai2thor

An open-source platform for Visual AI.
http://ai2thor.allenai.org
Apache License 2.0

Unity crashing on Colab #937

Open 1996srikesh opened 2 years ago

1996srikesh commented 2 years ago

Hi @ekolve ,

Thank you so much for promptly assisting me with my previous tickets. I am now experiencing a Unity crash after running about 10 episodes.

In my training code, I observed the following error:

INFO:teach.simulators.simulator_THOR:Finished initializing simulator. Total time: 0.07271265983581543 sec
task_dict["task_id"] 112
tasks [<teach.dataset.task_THOR.Task_THOR object at 0x7f1d1ebc5710>]
INFO:teach.replay.episode_replay:Starting episode...
INFO:teach.simulators.simulator_THOR:In simulator_THOR.start_new_episode, world = FloorPlan3_physics world_type = None
INFO:teach.simulators.simulator_THOR:In SimulatorTHOR.start_new_episode, before launch_simulator init_params {'base_dir': '/root/.ai2thor/', 'scene': 'FloorPlan3_physics', 'gridSize': 0.25, 'snapToGrid': True, 'visibilityDistance': 1.5, 'width': 900, 'height': 900, 'agentCount': 1, 'server_class': <class 'ai2thor.wsgi_server.WsgiServer'>, 'platform': <class 'ai2thor.platform.CloudRendering'>}
INFO:teach.simulators.simulator_THOR:In SimulatorTHOR.__launch_simulator, creating ai2thor controller (unity process)
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): s3-us-west-2.amazonaws.com:80
DEBUG:urllib3.connectionpool:http://s3-us-west-2.amazonaws.com:80 "HEAD /ai2-thor-public/builds/thor-CloudRendering-2f8dd9f95e4016db60155a0cc18b834a6339c8e1.zip HTTP/1.1" 403 0
DEBUG:ai2thor.build:/root/.ai2thor/releases/thor-CloudRendering-54535f6b9d76896c2ccb4532727aeda5741a9061/thor-CloudRendering-54535f6b9d76896c2ccb4532727aeda5741a9061 exists - skipping download
Traceback (most recent call last):
  File "train_reinforce.py", line 55, in <module>
    agent.train()
  File "/content/Project/blocks/BlockWorldRoboticAgent/agent.py", line 302, in train
    self.learning_alg.train(self.sess, self.train_writer)
  File "/content/Project/blocks/BlockWorldRoboticAgent/learning/policy_gradient.py", line 195, in train
    er._set_up_new_episode(None,False,None)
  File "/content/Project/blocks/BlockWorldRoboticAgent/teach/replay/episode_replay.py", line 370, in _set_up_new_episode
    commander_embodied=True if self.episode.commander_embodied == "True" else False,
  File "/content/Project/blocks/BlockWorldRoboticAgent/teach/simulators/simulator_THOR.py", line 424, in start_new_episode
    self.launch_simulator(world=world, world_type=world_type)
  File "/content/Project/blocks/BlockWorldRoboticAgent/teach/simulators/simulator_THOR.py", line 2350, in __launch_simulator
    self.controller = TEAChController(init_params)
  File "/content/Project/blocks/BlockWorldRoboticAgent/teach/simulators/simulator_THOR.py", line 45, in __init__
    super().__init__(kwargs)
  File "/usr/local/lib/python3.7/dist-packages/ai2thor/controller.py", line 498, in __init__
    host=host,
  File "/usr/local/lib/python3.7/dist-packages/ai2thor/controller.py", line 1292, in start
    self._start_unity_thread(env, width, height, unity_params, image_name)
  File "/usr/local/lib/python3.7/dist-packages/ai2thor/controller.py", line 1020, in _start_unity_thread
    raise Exception(message)
Exception: Unity process has exited - check Player.log for errors. Confirm that Vulkan is properly configured on this system using vulkaninfo from the vulkan-utils package. returncode=-11

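For readers interpreting the returncode=-11 in the exception: assuming the value comes from Python's subprocess module, a negative return code -N on POSIX means the child process was terminated by signal N, and signal 11 on Linux is SIGSEGV (a segmentation fault). The helper below is a small illustrative sketch; the name describe_returncode is hypothetical and not part of ai2thor.

```python
import signal


def describe_returncode(returncode: int) -> str:
    """Explain a subprocess return code.

    On POSIX, Python's subprocess module reports -N when the child
    process was terminated by signal N.
    """
    if returncode >= 0:
        return f"exited with status {returncode}"
    signal_number = -returncode
    try:
        name = signal.Signals(signal_number).name
    except ValueError:
        # The number does not map to a signal known on this platform.
        name = "unknown signal"
    return f"terminated by signal {signal_number} ({name})"


print(describe_returncode(-11))
```

A segmentation fault in the Unity process would be consistent with the message's advice to check Player.log, although this alone does not identify the cause.
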
It's a little difficult for me to replicate the problem in a smaller file, unfortunately. I consulted https://github.com/allenai/ai2thor/issues/729 and tried catching the exception and calling reset() from my application, but the error persists.
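
The traceback shows the exception being raised while the controller is being constructed, so there may be no live controller object to call reset() on at that point. One option, sketched below under that assumption, is to retry the construction itself. The helper create_with_retry and its parameters are hypothetical names for illustration, and retrying may only mask the underlying crash rather than fix it.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def create_with_retry(factory: Callable[[], T], attempts: int = 3, delay_s: float = 1.0) -> T:
    """Call factory() until it succeeds, up to `attempts` times.

    `factory` is any zero-argument callable that builds a fresh object,
    for example a lambda that constructs a new simulator controller.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error: Optional[Exception] = None
    for attempt in range(1, attempts + 1):
        try:
            return factory()
        except Exception as err:  # the controller raises a plain Exception
            last_error = err
            print(f"attempt {attempt}/{attempts} failed: {err}")
            if attempt < attempts:
                time.sleep(delay_s)
    raise RuntimeError(f"giving up after {attempts} attempts") from last_error


# Hypothetical usage, mirroring the call shown in the traceback:
# controller = create_with_retry(lambda: TEAChController(init_params))
```

Each attempt builds a new object rather than reusing a failed one, which is the main difference from calling reset() on an existing controller.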

ekolve commented 2 years ago

Could you share your training code, a requirements.txt (listing any dependencies), your Python version, and your CUDA version? I can try to reproduce the issue on my side.
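
One way to collect the Python and CUDA details is sketched below. It assumes the CUDA toolkit's nvcc is on PATH (it reports "not found" otherwise); the function names are hypothetical. A requirements.txt can typically be produced separately with pip freeze > requirements.txt.

```python
import platform
import shutil
import subprocess


def cuda_version() -> str:
    """Return the output of `nvcc --version`, or a note if nvcc is missing."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return "nvcc not found on PATH"
    result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
    return result.stdout.strip() or result.stderr.strip()


def environment_report() -> dict:
    """Gather the version details that are useful in a bug report."""
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "cuda": cuda_version(),
    }


for key, value in environment_report().items():
    print(f"{key}: {value}")
```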

1996srikesh commented 2 years ago

I am using Python3 and Colab GPU CUDA version cuda_11.1.TC455_06.29190527_0. The training code is in policy_gradient.py; I run it by calling !python3 train_reinforce.py. Can you give me a link where I can share a tarball of my files with you?

ekolve commented 2 years ago

You can email the code to erick@allenai.org

1996srikesh commented 2 years ago

Done!