OSError No buffer space available during firmware update / large data transfer with socketcan

Occasionally during a firmware update Yakut will crash and the socket that pycyphal has created will claim that there is "No buffer space available" aka ENOBUFS.

Traceback:

2022-07-05 18:02:02 0101068 ERR pycyphal.presentation._port._server: Server(dtype=uavcan.file.Read.1.1, input_transport_session=CANInputSession(InputSessionSpecifier(data_specifier=ServiceDataSpecifier(service_id=408, role=<Role.REQUEST: 1>), remote_node_id=None), PayloadMetadata(extent_bytes=300))) task failure: [Errno 105] No buffer space available
Traceback (most recent call last):
  File "/home/alex/.local/share/virtualenvs/uavcan_v1-Kt-sRht1/lib/python3.9/site-packages/pycyphal/presentation/_port/_server.py", line 201, in task_function
    await self.serve_for(handler, _LISTEN_FOREVER_TIMEOUT)
  File "/home/alex/.local/share/virtualenvs/uavcan_v1-Kt-sRht1/lib/python3.9/site-packages/pycyphal/presentation/_port/_server.py", line 182, in serve_for
    return await self.serve(handler, monotonic_deadline=loop.time() + timeout)
  File "/home/alex/.local/share/virtualenvs/uavcan_v1-Kt-sRht1/lib/python3.9/site-packages/pycyphal/presentation/_port/_server.py", line 170, in serve
    await self._do_send(response, meta, response_transport_session, loop.time() + self._send_timeout)
  File "/home/alex/.local/share/virtualenvs/uavcan_v1-Kt-sRht1/lib/python3.9/site-packages/pycyphal/presentation/_port/_server.py", line 309, in _do_send
    return await session.send(transfer, monotonic_deadline)
  File "/home/alex/.local/share/virtualenvs/uavcan_v1-Kt-sRht1/lib/python3.9/site-packages/pycyphal/transport/can/_session/_output.py", line 252, in send
    return await self._do_send(can_id, transfer, monotonic_deadline)
  File "/home/alex/.local/share/virtualenvs/uavcan_v1-Kt-sRht1/lib/python3.9/site-packages/pycyphal/transport/can/_session/_output.py", line 163, in _do_send
    if await self._send_handler(transaction):
  File "/home/alex/.local/share/virtualenvs/uavcan_v1-Kt-sRht1/lib/python3.9/site-packages/pycyphal/transport/can/_can.py", line 329, in _do_send
    num_sent = await self._maybe_media.send(
  File "/home/alex/.local/share/virtualenvs/uavcan_v1-Kt-sRht1/lib/python3.9/site-packages/pycyphal/transport/can/media/socketcan/_socketcan.py", line 153, in send
    raise err
  File "/home/alex/.local/share/virtualenvs/uavcan_v1-Kt-sRht1/lib/python3.9/site-packages/pycyphal/transport/can/media/socketcan/_socketcan.py", line 139, in send
    await asyncio.wait_for(
  File "/usr/lib64/python3.9/asyncio/tasks.py", line 479, in wait_for
    return fut.result()
  File "/usr/lib64/python3.9/asyncio/selector_events.py", line 441, in sock_sendall
    n = sock.send(data)
OSError: [Errno 105] No buffer space available

I have found that patching pycyphal's transport/can/media/socketcan/_socketcan.py in _make_socket with

_SO_SNDBUF = 7
s.setsockopt(socket.SOL_SOCKET, _SO_SNDBUF, 1)

allows updates (and I assume other large data transfers) to occur with no issues from my testing. I assume a proper fix using this method would probably set SNDBUF to something greater than 1 though.

There is some discussion of this issue in section 3.4 of http://rtime.felk.cvut.cz/can/socketcan-qdisc-final.pdf, although I found that I did not have to adjust the default queue length of 10 in my case even with using the minimum SNDBUF.

A couple other ways that this could be worked around that I found are increasing the default queue length (although is just hiding the issue?) or adding some retry logic to the send function when OSError's errno == errno.ENOBUFS.

Any thoughts?

OpenCyphal / pycyphal

OSError No buffer space available during firmware update / large data transfer with socketcan #233