dask / distributed

A distributed task scheduler for Dask
https://distributed.dask.org
BSD 3-Clause "New" or "Revised" License

Opening the TCP comm port in a browser results in an exception #8905

Open jacobtomlinson opened 1 month ago

If you start the scheduler and accidentally open the TCP comm port in a browser instead of the dashboard port, you get a confusing pickle message in the browser and an exception in the scheduler.

[image: screenshot of the browser output]
$ dask scheduler
2024-10-24 12:36:30,908 - distributed.scheduler - INFO - -----------------------------------------------
2024-10-24 12:36:31,259 - distributed.scheduler - INFO - State start
2024-10-24 12:36:31,262 - distributed.scheduler - INFO - -----------------------------------------------
2024-10-24 12:36:31,263 - distributed.scheduler - INFO -   Scheduler at:   tcp://10.51.100.43:8786
2024-10-24 12:36:31,263 - distributed.scheduler - INFO -   dashboard at:  http://10.51.100.43:8787/status
2024-10-24 12:36:31,263 - distributed.scheduler - INFO - Registering Worker plugin shuffle
2024-10-24 12:36:34,608 - tornado.application - ERROR - Exception in callback functools.partial(<function TCPServer._handle_connection.<locals>.<lambda> at 0x7f26e4194ae0>, <Task finished name='Task-254' coro=<BaseTCPListener._handle_stream() done, defined at /home/jtomlinson/Projects/dask/distributed/distributed/comm/tcp.py:654> exception=MemoryError((6073139484287059271,), dtype('uint8'))>)
Traceback (most recent call last):
  File "/home/jtomlinson/miniconda3/envs/dask/lib/python3.11/site-packages/tornado/ioloop.py", line 750, in _run_callback
    ret = callback()
          ^^^^^^^^^^
  File "/home/jtomlinson/miniconda3/envs/dask/lib/python3.11/site-packages/tornado/tcpserver.py", line 387, in <lambda>
    gen.convert_yielded(future), lambda f: f.result()
                                           ^^^^^^^^^^
  File "/home/jtomlinson/Projects/dask/distributed/distributed/comm/tcp.py", line 666, in _handle_stream
    await self.on_connection(comm)
  File "/home/jtomlinson/Projects/dask/distributed/distributed/comm/core.py", line 288, in on_connection
    return await super().on_connection(comm, handshake_overrides)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jtomlinson/Projects/dask/distributed/distributed/comm/core.py", line 267, in on_connection
    handshake = await comm.read()
                ^^^^^^^^^^^^^^^^^
  File "/home/jtomlinson/Projects/dask/distributed/distributed/comm/tcp.py", line 227, in read
    frames_nosplit = await read_bytes_rw(stream, frames_nosplit_nbytes)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jtomlinson/Projects/dask/distributed/distributed/comm/tcp.py", line 359, in read_bytes_rw
    buf = host_array(n)
          ^^^^^^^^^^^^^
  File "/home/jtomlinson/Projects/dask/distributed/distributed/protocol/utils.py", line 29, in host_array
    return numpy.empty((n,), dtype="u1").data
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 5.27 EiB for an array with shape (6073139484287059271,) and data type uint8
2024-10-24 12:36:34,649 - tornado.application - ERROR - Exception in callback functools.partial(<function TCPServer._handle_connection.<locals>.<lambda> at 0x7f26e4194cc0>, <Task finished name='Task-257' coro=<BaseTCPListener._handle_stream() done, defined at /home/jtomlinson/Projects/dask/distributed/distributed/comm/tcp.py:654> exception=MemoryError((8530211521808319815,), dtype('uint8'))>)
Traceback (most recent call last):
  File "/home/jtomlinson/miniconda3/envs/dask/lib/python3.11/site-packages/tornado/ioloop.py", line 750, in _run_callback
    ret = callback()
          ^^^^^^^^^^
  File "/home/jtomlinson/miniconda3/envs/dask/lib/python3.11/site-packages/tornado/tcpserver.py", line 387, in <lambda>
    gen.convert_yielded(future), lambda f: f.result()
                                           ^^^^^^^^^^
  File "/home/jtomlinson/Projects/dask/distributed/distributed/comm/tcp.py", line 666, in _handle_stream
    await self.on_connection(comm)
  File "/home/jtomlinson/Projects/dask/distributed/distributed/comm/core.py", line 288, in on_connection
    return await super().on_connection(comm, handshake_overrides)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jtomlinson/Projects/dask/distributed/distributed/comm/core.py", line 267, in on_connection
    handshake = await comm.read()
                ^^^^^^^^^^^^^^^^^
  File "/home/jtomlinson/Projects/dask/distributed/distributed/comm/tcp.py", line 227, in read
    frames_nosplit = await read_bytes_rw(stream, frames_nosplit_nbytes)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jtomlinson/Projects/dask/distributed/distributed/comm/tcp.py", line 359, in read_bytes_rw
    buf = host_array(n)
          ^^^^^^^^^^^^^
  File "/home/jtomlinson/Projects/dask/distributed/distributed/protocol/utils.py", line 29, in host_array
    return numpy.empty((n,), dtype="u1").data
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 7.40 EiB for an array with shape (8530211521808319815,) and data type uint8

This is somewhat expected, since that port isn't meant to be opened in a browser. But it would be nice if it failed more gracefully.
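
For context, the array shapes in the two tracebacks look like the first eight bytes of the browser's HTTP request read as a frame length. The sketch below assumes the comm reads an 8-byte unsigned little-endian length prefix (the format string and the helper name `as_length_prefix` are illustrative assumptions, not taken from the code above), and that the second request was the browser fetching `/favicon.ico`.

```python
import struct


def as_length_prefix(first_bytes: bytes) -> int:
    """Interpret the first 8 bytes of a stream as a length prefix.

    Assumption: unsigned 64-bit, little-endian. This is a hypothetical
    helper for illustration only.
    """
    (n,) = struct.unpack("<Q", first_bytes[:8])
    return n


# What a browser would plausibly send first on this port (assumed requests).
root_request = b"GET / HTTP/1.1\r\n"
favicon_request = b"GET /favicon.ico HTTP/1.1\r\n"

for request in (root_request, favicon_request):
    n = as_length_prefix(request)
    print(f"{request[:8]!r} -> {n} bytes ({n / 2**60:.2f} EiB)")

# b'GET / HT' -> 6073139484287059271 bytes (5.27 EiB)
# b'GET /fav' -> 8530211521808319815 bytes (7.40 EiB)
```

Both values match the shapes reported by `numpy.empty` in the log, which suggests the scheduler is trying to allocate a buffer sized by the ASCII text `GET / HT` and `GET /fav`.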

It would be interesting to see if we can detect an HTTP connection and behave differently. For example, we could try to show a better error in the browser and avoid going down the code path that raises the exception in the scheduler.
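
A minimal sketch of what that detection could look like, assuming we can inspect the first few bytes of a new connection before treating them as a handshake. The names `looks_like_http` and `plain_text_rejection` are hypothetical and not part of distributed; how this would hook into the comm layer is left open.

```python
# Common HTTP request methods, each followed by the space that separates
# the method from the request target.
HTTP_METHODS = (
    b"GET ", b"POST ", b"HEAD ", b"PUT ", b"DELETE ", b"OPTIONS ", b"PATCH ",
)


def looks_like_http(first_bytes: bytes) -> bool:
    """Heuristic: does the stream start with an HTTP request line?"""
    return first_bytes.startswith(HTTP_METHODS)


def plain_text_rejection(message: str) -> bytes:
    """Build a small HTTP/1.1 400 response that a browser can render."""
    body = message.encode("utf-8")
    head = (
        "HTTP/1.1 400 Bad Request\r\n"
        "Content-Type: text/plain; charset=utf-8\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    return head + body


# Example: the bytes a browser sent in the log above.
first = b"GET / HT"
if looks_like_http(first):
    response = plain_text_rejection(
        "This is the Dask scheduler comm port, not the dashboard.\n"
    )
    print(response.decode("utf-8"))
```

This is only a heuristic: it checks a fixed list of methods and says nothing about TLS or other non-Dask clients. A complementary safeguard might be to reject implausibly large frame lengths before allocating a buffer.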