riga / law

Build large-scale task workflows: luigi + job submission + remote targets + environment sandboxing using Docker/Singularity
http://law.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Long TMPDIR breaks multiprocessing #184

Closed lmoureaux closed 2 months ago

lmoureaux commented 3 months ago

Bug description

On slurm, law explicitly changes TMPDIR to a place within the job's own directory. This breaks packages relying on the default arguments of multiprocessing.Listener because this involves creating a UNIX socket in TMPDIR, and socket paths can only be so long.

It seems that multiprocessing.Queue exposes the issue, but I cannot confirm this 100%.

Example stack trace (followed by a deadlock):

Traceback (most recent call last):
  File "/home/mourelou/micromamba/envs/case/lib/python3.11/multiprocessing/queues.py", line 244, in _feed
    obj = _ForkingPickler.dumps(obj)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mourelou/micromamba/envs/case/lib/python3.11/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
  File "/home/mourelou/micromamba/envs/case/lib/python3.11/site-packages/torch/multiprocessing/reductions.py", line 569, in reduce_storage
    df = multiprocessing.reduction.DupFd(fd)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mourelou/micromamba/envs/case/lib/python3.11/multiprocessing/reduction.py", line 198, in DupFd
    return resource_sharer.DupFd(fd)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mourelou/micromamba/envs/case/lib/python3.11/multiprocessing/resource_sharer.py", line 53, in __init__
    self._id = _resource_sharer.register(send, close)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mourelou/micromamba/envs/case/lib/python3.11/multiprocessing/resource_sharer.py", line 76, in register
    self._start()
  File "/home/mourelou/micromamba/envs/case/lib/python3.11/multiprocessing/resource_sharer.py", line 126, in _start
    self._listener = Listener(authkey=process.current_process().authkey, backlog=128)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mourelou/micromamba/envs/case/lib/python3.11/multiprocessing/connection.py", line 464, in __init__
    self._listener = SocketListener(address, family, backlog)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mourelou/micromamba/envs/case/lib/python3.11/multiprocessing/connection.py", line 608, in __init__
    self._socket.bind(address)
OSError: AF_UNIX path too long

riga commented 2 months ago

Thanks for the info @lmoureaux !

IMHO, this is indeed an odd constraint of unix sockets (the limit is 107 bytes if I recall correctly).

It might depend on the actual cluster configuration, but afaik slurm jobs use the submission directory (usually transparently visible from the submission node) as the base for temporary files, which is not cleaned up automatically.

That's why law changes the default to a place that is cleaned up after the job terminates, but I think we should change that behavior for slurm.

A temporary fix on your side could be along these lines: https://github.com/columnflow/columnflow/blob/835020b57280947f500e232241b014131735d8f2/columnflow/tasks/framework/remote.py#L813-L818
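Such a workaround might look like the following. This is a hypothetical sketch, not the actual columnflow code; it simply falls back to a short system tmp directory when TMPDIR is too long:

```python
import os
import tempfile


def ensure_short_tmpdir(max_len=60):
    """Fall back to /tmp when TMPDIR is too long for AF_UNIX sockets.

    max_len leaves headroom below the ~107-byte sockaddr_un limit for the
    socket file names that multiprocessing appends. Hypothetical helper,
    not the actual columnflow implementation.
    """
    if len(os.environ.get("TMPDIR", "")) > max_len:
        os.environ["TMPDIR"] = "/tmp"
        tempfile.tempdir = None  # make tempfile re-read TMPDIR
    return os.environ.get("TMPDIR", "")
```

Note that this must run before the first use of multiprocessing, because multiprocessing creates and caches its per-process temp directory lazily.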

However, I will make sure something similar is done centrally within law.

riga commented 2 months ago

The fix I referenced above is generic enough, so I simply moved it to the slurm base workflow. Feel free to reopen in case the issue persists :+1:

lmoureaux commented 2 months ago

Thanks! I agree that the limit on UNIX sockets is weird, but it's baked into the sockaddr_un struct, and making it longer would be a major change... For backward compatibility one would basically need to duplicate the whole API.
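For completeness, the limit is enforced at bind time and can be probed directly with a raw socket (a quick sketch; the exact maximum is platform-dependent, 108 bytes total for sun_path on Linux, 104 on macOS):

```python
import socket
import tempfile

# Binding an AF_UNIX socket to an over-long path fails immediately;
# the limit comes from the fixed-size sun_path field in sockaddr_un.
sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
too_long = tempfile.gettempdir() + "/" + "x" * 200
try:
    sock.bind(too_long)
    bound = True
except OSError as exc:
    bound = False
    print(exc)
finally:
    sock.close()
```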