allenai / Holodeck

CVPR 2024: Language Guided Generation of 3D Embodied AI Environments.
https://yueyang1996.github.io/holodeck
Apache License 2.0

Cuda Multiprocessing Issue #20

Closed Singh-sid930 closed 3 months ago

Singh-sid930 commented 5 months ago

I tried running the simple example in the README, which led to this error: `RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method`. I plugged in `torch.multiprocessing.set_start_method('spawn', force=True)` under the `if __name__ == "__main__":` block, as suggested in a few posts online, which led to another error:


```
[W CudaIPCTypes.cpp:15] Producer process has been terminated before all shared CUDA tensors released. See Note [Sharing CUDA tensors]
```
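For reference, this is roughly the change I tried (a minimal sketch only; `run_generation` is a placeholder for whatever main.py actually calls, not the real entry point):

```python
import torch.multiprocessing as mp

def run_generation():
    # placeholder for the actual Holodeck generation call in main.py
    pass

if __name__ == "__main__":
    # force the 'spawn' start method so worker processes can initialize CUDA
    mp.set_start_method("spawn", force=True)
    run_generation()
```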
Singh-sid930 commented 5 months ago

After changing the devices for the CUDA tensors and setting multiprocessing to False, I was able to fix this. This should really be in the troubleshooting section, or just changed in the code.
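For anyone else hitting this, the pattern of the workaround looks roughly like this (a toy sketch, not the actual Holodeck code; `make_tensors` and `score` just stand in for the GPU embedding and solver steps):

```python
import torch

def make_tensors(device: torch.device) -> list[torch.Tensor]:
    """Toy stand-in for the embeddings kept on the GPU."""
    return [torch.randn(8, device=device) for _ in range(4)]

def score(t: torch.Tensor) -> float:
    """Toy stand-in for the per-room solver work."""
    return float(t.sum())

if __name__ == "__main__":
    # pin every CUDA tensor to one explicit device instead of the default context
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    tensors = make_tensors(device)

    # with multiprocessing disabled, the work runs sequentially in the main
    # process, so no CUDA tensors are shared across forked workers
    results = [score(t) for t in tensors]
    print(results)
```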

Singh-sid930 commented 5 months ago


```
bathtub-0 | edge
Number of solutions found: 2
Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/resource_sharer.py", line 145, in _serve
    send(conn, destination_pid)
  File "/usr/lib/python3.10/multiprocessing/resource_sharer.py", line 50, in send
    reduction.send_handle(conn, new_fd, pid)
  File "/usr/lib/python3.10/multiprocessing/reduction.py", line 183, in send_handle
    with socket.fromfd(conn.fileno(), socket.AF_UNIX, socket.SOCK_STREAM) as s:
  File "/usr/lib/python3.10/socket.py", line 545, in fromfd
    nfd = dup(fd)
OSError: [Errno 24] Too many open files
Exception in thread Thread-3 (_handle_workers):
Traceback (most recent call last):
  File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 516, in _handle_workers
    cls._maintain_pool(ctx, Process, processes, pool, inqueue,
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 340, in _maintain_pool
    Pool._repopulate_pool_static(ctx, Process, processes, pool,
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 329, in _repopulate_pool_static
    w.start()
  File "/usr/lib/python3.10/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/usr/lib/python3.10/multiprocessing/context.py", line 288, in _Popen
    return Popen(process_obj)
  File "/usr/lib/python3.10/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/usr/lib/python3.10/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/usr/lib/python3.10/multiprocessing/popen_spawn_posix.py", line 54, in _launch
    child_r, parent_w = os.pipe()
OSError: [Errno 24] Too many open files
```

Running into this error consistently when using this command:

`python3 main.py --query "a two bedroom apartment" --openai`
Singh-sid930 commented 5 months ago

For the too many open files issue, I had to run `ulimit -n 4096`. Note that this should be done in the same terminal in which you are running the scripts, or just add it to your `.bashrc`.
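If you prefer not to touch the shell config, the same limit can also be raised from inside the script with the standard `resource` module (a sketch; 4096 is just the value that worked for me):

```python
import resource

# raise the soft limit on open file descriptors for this process only,
# capped at the hard limit (the in-process equivalent of `ulimit -n 4096`)
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
resource.setrlimit(resource.RLIMIT_NOFILE, (min(4096, hard), hard))
```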

YueYANG1996 commented 3 months ago

Thanks for raising this issue! I will update the code soon.