adafruit / circuitpython

CircuitPython - a Python implementation for teaching coding with microcontrollers
https://circuitpython.org

ESP LWIP network stack cannot handle 3 binds correctly. #8363

Closed bill88t closed 7 months ago

bill88t commented 1 year ago

CircuitPython version

Adafruit CircuitPython 9.0.0-alpha.1-25-g000d22f25 on 2023-08-30; VCC-GND YD-ESP32-S3 (N16R8) with ESP32S3

Code/REPL

import wifi
from socketpool import SocketPool
from adafruit_requests import Session

pool = SocketPool(wifi.radio)
_socket = pool.socket(pool.AF_INET, pool.SOCK_STREAM)
_socket2 = pool.socket(pool.AF_INET, pool.SOCK_STREAM)

_socket.bind(("0.0.0.0", 20))
_socket.listen(1)
_socket2.bind(("0.0.0.0", 21))
_socket2.listen(1)
a = _socket.accept()
b = _socket2.accept()
a.close()
b.close()
print("ok")

Behavior

Expected: if we telnet into both ports, the program prints "ok" and closes both connections. Actual: we can only connect on the first port. The second connection is stuck at SYN_WAIT.

Description

No response

Additional information

This is needed to properly implement PASV for FTP. ACTIVE mode, with a bind and a connection, works just fine.

Reproducible on S2 too.

anecdata commented 1 year ago

Had to tweak the close() calls (accept() returns a tuple, so the sockets are a[0] and b[0]). Then it works on raspberrypi, but espressif gets stuck in _socket2.accept() (the client times out on connect).

server code.py

```py
import wifi
from socketpool import SocketPool

pool = SocketPool(wifi.radio)
_socket = pool.socket(pool.AF_INET, pool.SOCK_STREAM)
_socket2 = pool.socket(pool.AF_INET, pool.SOCK_STREAM)

_socket.bind(("0.0.0.0", 20))
_socket.listen(1)
_socket2.bind(("0.0.0.0", 21))
_socket2.listen(1)
a = _socket.accept()
b = _socket2.accept()
a[0].close()
b[0].close()
print("ok")
```

anecdata commented 1 year ago

works in asyncio on both platforms:

server code.py

```py
import asyncio
import wifi
import socketpool

PORT1 = 20
PORT2 = 21

async def tcpserver(PORT):
    s = pool.socket(pool.AF_INET, pool.SOCK_STREAM)
    s.bind(("", PORT))
    s.listen(1)
    s.settimeout(0)
    while True:
        try:
            conn, addr = s.accept()
            print(f"{PORT} OK {addr}")
            conn.close()
        except OSError:  # EAGAIN
            pass
        await asyncio.sleep(0)

pool = socketpool.SocketPool(wifi.radio)

async def main():
    t1 = asyncio.create_task(tcpserver(PORT1))
    t2 = asyncio.create_task(tcpserver(PORT2))
    await asyncio.gather(t1, t2)

asyncio.run(main())
```

CPython client code for both cases above

```py
#!/usr/bin/env python3
import socket
import time
import random

# edit host and port to match server
HOST = "192.168.6.198"
PORTS = (20, 21)
TIMEOUT = 5
INTERVAL = 1

while True:
    PORT = random.choice(PORTS)  # just for fun
    print("Create TCP Client Socket")
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(TIMEOUT)
        print("Connecting")
        s.connect((HOST, PORT))
        size = s.send(b'Hello, world')
        print(f"Sent {size} bytes to {HOST}:{PORT}")
    time.sleep(INTERVAL)
```

Edit: I guess this works because only one client is connected at a time. It may still be a useful workaround.

bill88t commented 1 year ago

Run watch -n0.2 'netstat | grep "board-ip-here"' to monitor all connections and their status. The second one, as stated above, is stuck at SYN_WAIT. For all I know it could just be an oopsie, such as sending the SYN response to the wrong connection.
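As a complement to watching netstat, a small CPython script run from the desktop can report which ports complete the TCP handshake while the earlier connections are still held open. This is only a sketch: `probe_ports` is a hypothetical helper name, and the host and port values passed to it are whatever your board uses.

```python
import socket


def probe_ports(host, ports, timeout=5.0):
    """Try to hold one TCP connection open per port at the same time.

    Returns a dict mapping each port to "connected" or "failed: <reason>".
    """
    results = {}
    held = []  # keep sockets open so the connections overlap
    try:
        for port in ports:
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            s.settimeout(timeout)
            try:
                s.connect((host, port))
                results[port] = "connected"
                held.append(s)
            except OSError as exc:  # includes timeouts and refusals
                results[port] = f"failed: {exc}"
                s.close()
    finally:
        # Release every connection that was opened.
        for s in held:
            s.close()
    return results
```

Calling `probe_ports("board-ip-here", (20, 21))` while the server script is waiting in accept() would show whether the second port times out. Unlike the random-port client above, it does not close the first connection before trying the second.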

Regarding asyncio, I drafted this:

import asyncio
import wifi
import socketpool
from sys import stdout

conn1 = None
conn2 = None

async def tcpserver1():
    s = pool.socket(pool.AF_INET, pool.SOCK_STREAM)
    s.bind(("0.0.0.0", 20))
    s.listen(1)
    s.settimeout(10)
    conn1, addr = s.accept()
    print(f"{conn1} OK {addr}")

async def tcpserver2():
    s = pool.socket(pool.AF_INET, pool.SOCK_STREAM)
    s.bind(("0.0.0.0", 21))
    s.listen(1)
    s.settimeout(10)
    conn2, addr = s.accept()
    print(f"{conn2} OK {addr}")

async def tcpserver1close():
    conn1.close()
    print("Closed 1")

async def tcpserver2close():
    conn2.close()
    print("Closed 2")

pool = socketpool.SocketPool(wifi.radio)

async def main():
    t1 = asyncio.create_task(tcpserver1())
    t2 = asyncio.create_task(tcpserver2())
    t3 = asyncio.create_task(tcpserver1close())
    t4 = asyncio.create_task(tcpserver2close())
    await asyncio.gather(t1, t2)
    await asyncio.gather(t3, t4)

asyncio.run(main())

image

Which as you see, also doesn't work.
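Separately from the network-stack problem, the draft above has two Python-level issues worth ruling out. First, `s.accept()` with a 10-second timeout blocks the event loop while it waits, so the tasks run one after another rather than concurrently. Second, `conn1` and `conn2` are assigned as locals inside the server functions, so the close tasks still see the module-level `None`. Below is a minimal sketch of the intended shape, written against CPython's standard `socket` and `asyncio` modules on loopback so it can be run on a desktop. The helper names (`make_listener`, `accept_one`, `accept_both`) are hypothetical, and the sketch neither tests nor works around the Espressif behavior.

```python
import asyncio
import socket


def make_listener(port=0):
    """Create a non-blocking loopback listener; port 0 lets the OS pick."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind(("127.0.0.1", port))
    s.listen(1)
    s.setblocking(False)
    return s


async def accept_one(listener):
    """Poll a non-blocking listener until one client connects."""
    while True:
        try:
            return listener.accept()  # (conn, addr)
        except BlockingIOError:
            # Nothing pending yet; yield so the other task can run.
            await asyncio.sleep(0.01)


async def accept_both(listener1, listener2):
    """Accept one client on each listener; both connections stay open."""
    (conn1, _), (conn2, _) = await asyncio.gather(
        accept_one(listener1), accept_one(listener2)
    )
    # Returning the sockets avoids relying on module-level globals.
    return conn1, conn2
```

The connections are returned to the caller, which can then close them in whatever order it needs. On CircuitPython the same structure would use `pool.socket(...)` and `settimeout(0)`, as in the earlier asyncio example.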

Something to note:

import wifi
from socketpool import SocketPool
from adafruit_requests import Session

pool = SocketPool(wifi.radio)
_socket = pool.socket(pool.AF_INET, pool.SOCK_STREAM)
_socket2 = pool.socket(pool.AF_INET, pool.SOCK_STREAM)

_socket.bind(("0.0.0.0", 20))
_socket.listen(1)
_socket2.bind(("0.0.0.0", 20))
_socket2.listen(1)
print("Accepting 1")
a = _socket.accept()
print("Accepting 2")
b = _socket2.accept()
print("Accepted 2")
_socket.close()
print("ok")

is also a no-no. But:

import wifi
from socketpool import SocketPool
from adafruit_requests import Session

pool = SocketPool(wifi.radio)
_socket = pool.socket(pool.AF_INET, pool.SOCK_STREAM)
_socket2 = pool.socket(pool.AF_INET, pool.SOCK_STREAM)

_socket.bind(("0.0.0.0", 20))
_socket.listen(2)
print("Accepting 1")
a = _socket.accept()
print("Accepting 2")
b = _socket.accept()
print("Accepted 2")
_socket.close()
print("ok")

works just fine.

Auto-reload is off.
code.py output:
Accepting 1
Accepting 2
Accepted 2
ok

Code done running.

And Dolphin with PASV doesn't work with it, even though the CLI client does. So it can't be used as a workaround.

PASSIVE (PASV)

            This command requests the server-DTP to "listen" on a data
            port (which is not its default data port) and to wait for a
            connection rather than initiate one upon receipt of a
            transfer command.  The response to this command includes the
            host and port address this server is listening on.

Passing everything over the control port isn't something FTP is designed for, so it seems.

anecdata commented 1 year ago

There does seem to be an espressif bug when attempting simultaneous TCP connections (2 accepts in a row on different ports, without closing either) - your 1st example above and the original example.

I fully expect sequential (close before next connect) connections (to the same or different ports) to work (your 3rd example, and my asyncio example). I wouldn't expect 2 servers on the same port to work (your 2nd example).

But PASV doesn't work with sequential connection to the control port, then the data port?

dhalbert commented 1 year ago

> There does seem to be an espressif bug when attempting simultaneous TCP connections (2 accepts in a row on different ports, without closing either)

I think it would be good to test this with, say, MicroPython on some ESP32xx board to see whether it is our issue or an ESP-IDF one. Also maybe do a quick search of the ESP-IDF issues.

bill88t commented 1 year ago

> But PASV doesn't work with sequential connection to the control port, then the data port?

PASV tl;dr workflow explanation:

  1. Open port 21. This is the control connection.
  2. User connects on 21.
  3. User authenticates.
  4. User sends the PASV command to start the data socket.
  5. Server decides on a port to use, and opens it up (.listen()).
  6. Server sends the RFC defined formatted reply to the client over the control connection.
  7. Client immediately connects to the ip:port the server sent in step 6. This is the data connection.
  8. Client sends (over control connection) a command that needs a data connection, for example LIST.
  9. Server sends ok over control connection and then sends all the data over the data connection.
  10. Server closes the data connection and deinit's the socket, signaling the transaction is done.
  11. Client continues sending other commands, or PASV again.

This whole time, the control connection remains open and in use. If we close it, the client aborts. I did try.
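The reply in step 6 has a fixed shape in RFC 959: `227 Entering Passive Mode (h1,h2,h3,h4,p1,p2)`, where h1 to h4 are the IPv4 octets and the port is `p1 * 256 + p2`. A minimal CPython sketch of building and parsing that line follows. The function names are hypothetical, and nothing here is taken from an existing FTP library for CircuitPython.

```python
import re


def format_pasv_reply(ip, port):
    """Build the 227 reply advertising ip:port as the data endpoint."""
    octets = ip.split(".")
    if len(octets) != 4 or not 0 < port < 65536:
        raise ValueError("need a dotted IPv4 address and a 16-bit port")
    p1, p2 = divmod(port, 256)  # high byte, low byte
    return f"227 Entering Passive Mode ({','.join(octets)},{p1},{p2}).\r\n"


def parse_pasv_reply(line):
    """Extract (ip, port) from a 227 reply, as a client would."""
    m = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", line)
    if m is None:
        raise ValueError("not a PASV reply")
    nums = [int(x) for x in m.groups()]
    ip = ".".join(str(n) for n in nums[:4])
    port = nums[4] * 256 + nums[5]
    return ip, port
```

For example, `format_pasv_reply("192.168.6.198", 2121)` yields `227 Entering Passive Mode (192,168,6,198,8,73).` followed by CRLF, and `parse_pasv_reply` turns that back into `("192.168.6.198", 2121)`. The client connects to that port while the control connection on 21 stays open, which is why the server needs both sockets live at once.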

bill88t commented 1 year ago

New discovery!

If the web workflow is disabled, you can do 2 binds. So in reality it is the 3rd bind that fails (the web workflow presumably accounts for one).

import wifi
from socketpool import SocketPool
from sys import exit

try:
    wifi.radio.connect("Thinkpood", "REDACTED")
except:
    pass
if not wifi.radio.connected:
    print("No wifi")
    exit(0)

pool = SocketPool(wifi.radio)
_socket = pool.socket(pool.AF_INET, pool.SOCK_STREAM)
_socket.bind(("", 20))
_socket.listen(1)
_socket2 = pool.socket(pool.AF_INET, pool.SOCK_STREAM)
_socket2.bind(("", 21))
_socket2.listen(1)

print("Accepting 1")
a = _socket.accept()
print("Accepting 2")
b = _socket2.accept()
print("Accepted 2")

_socket.close()
_socket2.close()
print("ok")

image

bill88t commented 1 year ago

Taking the same code and adding:

_socket3 = pool.socket(pool.AF_INET, pool.SOCK_STREAM)
_socket3.bind(("", 22))

to it is all that is required to trigger the bug. Calling .listen() on it makes no difference.

This gives me some good clues as to where to look.

bill88t commented 1 year ago

I did plenty of ESP debugging. However, I only managed to get down as far as lwip_accept in ports/espressif/common-hal/socketpool/Socket.c:272. With esp_log we can see there that the 2nd/3rd socket is not being accepted. I tried going deeper, but the internal logging of lwip doesn't get printed no matter what I do. Importing esp_log is a mess I cannot figure out.

I will have to leave it to you guys from here on out. (Perhaps the C3 can prove it is not a paper brick and help with some JTAG?)

bill88t commented 1 year ago

I went and put MicroPython 1.20 on one of my S2 boards, and this is not reproducible there. I opened 4 sockets and connected to them successfully.

bill88t commented 7 months ago

This issue is fully resolved.