nneonneo / iOS-SOCKS-Server

iOS HTTP/SOCKS proxy server for fake-tethering

Add IPv6 Support #14

Closed philrosenthal closed 1 year ago

philrosenthal commented 1 year ago

Since many cellular networks nowadays run IPv6, it seemed worthwhile to add proper support for IPv6.

I also made the logging much less chatty.

nneonneo commented 1 year ago

From testing, the old version completes the IPv4 connection nearly instantly (no delay), and the IPv6 connection instantly fails (as opposed to hanging forever):

* Uses proxy env variable http_proxy == 'socks5://192.168.1.171:9876'
*   Trying 192.168.1.171:9876...
* Connected to 192.168.1.171 (192.168.1.171) port 9876 (#0)
* SOCKS5 connect to IPv6 2606:2800:220:1:248:1893:25c8:1946:80 (locally resolved)
* Can't complete SOCKS5 connection to 2606:2800:220:1:248:1893:25c8:1946. (5)
* Closing connection 0
curl: (97) Can't complete SOCKS5 connection to 2606:2800:220:1:248:1893:25c8:1946. (5)

I think this is an unfortunate failure mode, and possibly a bug with the new version that I haven't tracked down yet: if the upstream connection fails, the downstream proxy connection should be terminated immediately so clients don't hang.
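A minimal sketch of failing fast, assuming the proxy speaks plain RFC 1928 SOCKS5: when the upstream connect fails, send a failure reply right away and close the client socket. The helper names `socks5_reply` and `reply_for_error`, and the errno mapping, are hypothetical and not taken from this PR.

```python
import errno
import struct

# SOCKS5 reply codes from RFC 1928, section 6.
REP_GENERAL_FAILURE = 0x01
REP_NETWORK_UNREACHABLE = 0x03
REP_HOST_UNREACHABLE = 0x04
REP_CONNECTION_REFUSED = 0x05


def socks5_reply(rep: int) -> bytes:
    """Build a SOCKS5 reply with an all-zero IPv4 bind address and port."""
    # VER=5, REP, RSV=0, ATYP=1 (IPv4), BND.ADDR=0.0.0.0, BND.PORT=0
    return struct.pack("!BBBB4sH", 5, rep, 0, 1, b"\x00" * 4, 0)


def reply_for_error(exc: OSError) -> bytes:
    """Map an upstream connect error to a reply (illustrative mapping only)."""
    mapping = {
        errno.ENETUNREACH: REP_NETWORK_UNREACHABLE,
        errno.EHOSTUNREACH: REP_HOST_UNREACHABLE,
        errno.ECONNREFUSED: REP_CONNECTION_REFUSED,
    }
    return socks5_reply(mapping.get(exc.errno, REP_GENERAL_FAILURE))
```

A handler would write `reply_for_error(exc)` to the client and then close the socket, so the client sees an error instead of waiting on a dead connection.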

philrosenthal commented 1 year ago

I did actually encounter this with curl also. Try using socks5h://(proxy address); that should fix the problem. I’ve been using Firefox, which sends requests the same way.

curl's handling of IPv6 here is wrong. Happy Eyeballs requires simultaneous connections to both v4 and v6, with a 300ms delay for the v4 connection, using whichever connects first.

The implementation in this script isn’t perfect, but it does have a 5-second timeout on the v6 side. It also tests v6 when it first launches and, if v6 does not work, disables it for the rest of the session.

What curl is doing is only trying v6 and waiting for a long timeout. In any event, it’s a good reminder to look into why connecting to IPv6 by address instead of by host does not work.

philrosenthal commented 1 year ago

I found/fixed this bug, will post in the next commit.

nneonneo commented 1 year ago

Also, FWIW, Happy Eyeballs v2 (https://www.rfc-editor.org/rfc/rfc8305#section-5) recommends 250ms without RTT data, down to a minimum of 100ms.

I think the code should be refactored slightly to only perform the concurrent execution + delay if both IPv4 and IPv6 addresses are available and both IPv4 and IPv6 interfaces are available. Otherwise, it should straightforwardly attempt a single v4/v6 connection immediately based on what's available.

philrosenthal commented 1 year ago

The current implementation of happy eyeballs does exactly as you say ... if only v4 is available, it will immediately use only that, if only v6 is available, it will immediately use only that. I've also lowered the happy eyeballs advantage to 50ms. 300ms is probably too punitive. I did some testing at various penalties, and 50ms was the minimum level where it was correctly selecting IPv6 the majority of the time, without penalizing IPv4 unnecessarily.
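A rough sketch of the head-start idea described here, not the code in this PR: the preferred family (IPv6) starts immediately, the fallback (IPv4) starts after a short delay, and the first success wins. `race_with_head_start` and its coroutine-factory arguments are hypothetical names. Passing None for an unavailable family skips the race.

```python
import asyncio


async def race_with_head_start(preferred, fallback, head_start=0.05):
    """Start `preferred` now and `fallback` after `head_start` seconds;
    return the first successful result and cancel whatever is still pending.

    `preferred` and `fallback` are zero-argument coroutine functions
    (stand-ins for the IPv6 and IPv4 connect attempts), or None.
    """
    if preferred is None or fallback is None:
        only = preferred or fallback
        if only is None:
            raise OSError("no address family available")
        return await only()          # single family: connect immediately

    async def delayed_fallback():
        await asyncio.sleep(head_start)
        return await fallback()

    tasks = {asyncio.create_task(preferred()),
             asyncio.create_task(delayed_fallback())}
    last_error = None
    try:
        while tasks:
            done, tasks = await asyncio.wait(
                tasks, return_when=asyncio.FIRST_COMPLETED)
            for task in done:
                if task.exception() is None:
                    return task.result()
                last_error = task.exception()
        raise last_error             # both attempts failed
    finally:
        for task in tasks:           # cancel any attempt still in flight
            task.cancel()
```

This is simplified: if both attempts finish in the same loop iteration, only one result is returned and the other connection would need to be closed by the caller.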

nneonneo commented 1 year ago

I tried the latest version (defa7ad) but I still cannot get it to work over IPv6.

  1. The Public IPv4 comes up correctly and reports my cell connection's ISP and ASN.
  2. The IPv6 address comes up, but then says this:
Will connect to IPv6 servers over interface pdp_ip0 at 2605:[...]:429a
Failed to connect to www.google.com over IPv6 due to: [Errno 8] nodename nor servname provided, or not known
  3. Connecting to any server that has both IPv4 and IPv6 addresses, or only IPv6 addresses, fails:

iPhone:

DEBUG:root:<computer IP>:53865: new connection
DEBUG:root:global: resolving address ip.me
DEBUG:root:<computer IP>:53865 -> {'ipv4': '212.102.35.236', 'ipv6': '2a02:6ea0:c035::11'}:80: failed to connect to 2a02:6ea0:c035::11:80 due to [Errno 65] No route to host
ERROR:root:<computer IP>:53865 -> {'ipv4': '212.102.35.236', 'ipv6': '2a02:6ea0:c035::11'}:80: connect error Failed to connect

Mac:

http_proxy=socks5h://<iPhone IP>:9876 curl 'http://ip.me'

Same thing with socks5://, with example.com and google.com. However, if I try an IPv4 address, e.g. http_proxy=socks5h://<iPhone IP>:9876 curl 'http://93.184.216.34' -H 'Host: example.com' -vvvv, it works fine.

As far as I can tell, the root cause is that it's not making IPv6 connections properly. My WiFi has no IPv6 connectivity, as usual, but the cell interface does, as https://test-ipv6.com/ shows when disabling WiFi.

I believe the socket bind is failing to correctly force the socket to use the cellular interface for IPv6 traffic, but I'm not entirely sure why. I'll keep poking at it.

nneonneo commented 1 year ago

OK, I think I figured it out: I can't use a domain name when connecting; it only works if I resolve them to IPv6 addresses before feeding them to connect.

Consider the following script:

MY_IPv6 = "2605:[...]:429a"

from socket import socket, AF_INET6

# throws [Errno 49] Can't assign requested address
try:
    s = socket(AF_INET6)
    s.bind((MY_IPv6, 0))
    s.connect(("google.com", 80))
except Exception as e:
    print(e)

# throws [Errno 8] nodename nor servname provided, or not known
try:
    s = socket(AF_INET6)
    s.bind((MY_IPv6, 0))
    s.connect(("ip6only.me", 80))
except Exception as e:
    print(e)

# works fine
try:
    s = socket(AF_INET6)
    s.bind((MY_IPv6, 0))
    s.connect(("2607:f8b0:400a:804::200e", 80))
    s.send(b"HEAD / HTTP/1.1\r\nHost: google.com\r\n\r\n")
    print(s.recv(4096))
except Exception as e:
    print(e)

# works fine
try:
    s = socket(AF_INET6)
    s.bind((MY_IPv6, 0))
    s.connect(("2001:4838:0:1b::201", 80))
    s.send(b"GET /api/ HTTP/1.1\r\nHost: ip6only.me\r\n\r\n")
    print(s.recv(4096))
except Exception as e:
    print(e)

Only the last two connection attempts work, and they work perfectly; ip6only.me reports my cell IPv6 address as expected, even though WiFi is enabled etc. The former two fail, and they fail differently: the first with a misleading [Errno 49] Can't assign requested address, and the second with [Errno 8] nodename nor servname provided, or not known (and actually, No route to host the first time I tried it...)

So the solution will be to route everything through the DNS resolver first, and only ever call connect with IP addresses. This includes for things like IP testing and WHOIS queries!
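One way to enforce "only ever call connect with IP addresses" is to validate the destination before the socket is created. This is a sketch under that assumption; `connect_ipv6_literal` is a hypothetical helper, and resolving the hostname to an address is left to whatever resolver the proxy uses.

```python
import ipaddress
import socket


def connect_ipv6_literal(source_ip: str, dest_ip: str, port: int,
                         timeout: float = 5.0) -> socket.socket:
    """Connect from `source_ip` to `dest_ip`, refusing hostnames outright."""
    # ip_address() raises ValueError for anything that is not an IP literal,
    # so a hostname can never reach connect() by accident.
    dest = ipaddress.ip_address(dest_ip)
    if dest.version != 6:
        raise ValueError(f"{dest_ip} is not an IPv6 address")
    s = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
    try:
        s.settimeout(timeout)
        s.bind((source_ip, 0))       # pin the outgoing interface by address
        s.connect((str(dest), port))
    except Exception:
        s.close()                    # do not leak the fd on failure
        raise
    return s
```

With this guard, passing "google.com" fails with a clear ValueError rather than one of the misleading errno values seen above.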

(P.S. is the WHOIS query really necessary? I guess it provides some assurance that the connection is flowing over the user's desired ISP, but I am a bit concerned it'll add a potential point-of-failure and extra latency to startup...)

philrosenthal commented 1 year ago

RE: Whois - Not really necessary, but I do like it to have assurance that the data is going over the network I want it to. Maybe we can just make it an option to enable it or not if you really feel it shouldn't be there always. I'll make a configuration option to put in the source to disable it, and you can decide if you want it default on or default off. Will do on the next commit.

philrosenthal commented 1 year ago

Regarding the DNS issue you are having ... I'm really unclear what is different about your setup which causes this. I don't have any problem with DNS where I am running it ... and I'm pretty sure that in the pathways of code that it takes for me, it is doing like you say -- sending it to a resolver first and then connecting directly to the ipv4/ipv6 address. Not sure what I can do to solve your problem.

nneonneo commented 1 year ago

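Diagnosing this a bit more, I found the issue:

  • I'm getting Failed to connect to www.google.com over IPv6 due to: [Errno 8] nodename nor servname provided, or not known
  • However, this error is spurious; the connection to Google is fine, but it's the subsequent get_public_ip(6) call that fails: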
Traceback (most recent call last):
  File "/private/var/mobile/Library/Mobile Documents/iCloud~com~omz-software~Pythonista3/Documents/socks5/socks5_ipv6.py", line 234, in <module>
    public_ipv6 = get_public_ip(6)
  File "/private/var/mobile/Library/Mobile Documents/iCloud~com~omz-software~Pythonista3/Documents/socks5/socks5_ipv6.py", line 116, in get_public_ip
    conn.request("GET", parsed_url.path)
  File "/var/containers/Bundle/Application/E23FE3B6-72C6-4109-8447-BE0CC742E0F3/Pythonista3.app/Frameworks/Py3Kit.framework/pylib/http/client.py", line 1283, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/var/containers/Bundle/Application/E23FE3B6-72C6-4109-8447-BE0CC742E0F3/Pythonista3.app/Frameworks/Py3Kit.framework/pylib/http/client.py", line 1329, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/var/containers/Bundle/Application/E23FE3B6-72C6-4109-8447-BE0CC742E0F3/Pythonista3.app/Frameworks/Py3Kit.framework/pylib/http/client.py", line 1278, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/var/containers/Bundle/Application/E23FE3B6-72C6-4109-8447-BE0CC742E0F3/Pythonista3.app/Frameworks/Py3Kit.framework/pylib/http/client.py", line 1038, in _send_output
    self.send(msg)
  File "/var/containers/Bundle/Application/E23FE3B6-72C6-4109-8447-BE0CC742E0F3/Pythonista3.app/Frameworks/Py3Kit.framework/pylib/http/client.py", line 976, in send
    self.connect()
  File "/var/containers/Bundle/Application/E23FE3B6-72C6-4109-8447-BE0CC742E0F3/Pythonista3.app/Frameworks/Py3Kit.framework/pylib/http/client.py", line 942, in connect
    self.sock = self._create_connection(
  File "/var/containers/Bundle/Application/E23FE3B6-72C6-4109-8447-BE0CC742E0F3/Pythonista3.app/Frameworks/Py3Kit.framework/pylib/socket.py", line 825, in create_connection
    for res in getaddrinfo(host, port, 0, SOCK_STREAM):
  File "/var/containers/Bundle/Application/E23FE3B6-72C6-4109-8447-BE0CC742E0F3/Pythonista3.app/Frameworks/Py3Kit.framework/pylib/socket.py", line 956, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno 8] nodename nor servname provided, or not known

The core problem appears to be that http.client uses socket.create_connection internally; the latter performs a getaddrinfo without considering the source address. Since my WiFi cannot use IPv6 addresses, the result is a DNS resolution failure. There's no easy fix here; the best approach would be to hand-craft an HTTP request and response.
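For illustration, a hand-crafted request and response could look roughly like the sketch below. It assumes the socket has already been bound to the cell address and connected to an IP literal, and that the server honours `Connection: close` without chunked encoding. The function names are hypothetical.

```python
def build_get_request(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request; Connection: close lets the
    caller read until EOF instead of parsing Content-Length."""
    return (f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            "Connection: close\r\n"
            "\r\n").encode("ascii")


def read_all(sock) -> bytes:
    """Read from an already-connected socket until the peer closes it."""
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            return b"".join(chunks)
        chunks.append(chunk)


def parse_response(raw: bytes):
    """Split a raw HTTP response into (status_code, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line = head.split(b"\r\n", 1)[0]
    # Status line looks like: b"HTTP/1.1 200 OK"
    status_code = int(status_line.split(b" ", 2)[1])
    return status_code, body
```

Usage would be `sock.sendall(build_get_request("ip6only.me", "/api/"))` followed by `parse_response(read_all(sock))`, which keeps getaddrinfo out of the path entirely.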

In general I would prefer to avoid relying on third-party libraries and servers. I would actually recommend dropping the public-IP and WHOIS features for now. Note that WHOIS may very well fail if the primary network, i.e. WiFi, has no usable internet connection, as can happen with ad-hoc networks or heavily firewalled WiFi networks.

Let's keep this PR simple and drop extraneous features. We can add them in a future PR.

philrosenthal commented 1 year ago

@nneonneo I agree, and I'll remove this in a few commits. It is mostly working, so I'd like to keep it preserved until I've attempted to address the other couple of outstanding issues you mentioned. After that, I'll remove it, and I think it's probably at a decent point to merge back in if you like.

philrosenthal commented 1 year ago

Whois and get_public_ip are removed. CONNECT_HOST dictionary is split up. I've done testing by using Firefox to browse through the proxy, and both IPv4 and IPv6 work reliably for me. I'm satisfied with it as it is for a merge at this point, unless you have any other issues.

Keep in mind, the current version you have posted is not perfect either, and IMHO, this is a huge upgrade even for users who are IPv4-only... And realistically, probably most mobile users at this point will have IPv6 and will get a huge upgrade from that as well.

In any event, if there are any showstoppers, let me know, and I'll take a look... and I'm happy to stay in touch about making some of the other changes we discussed in comments (e.g. I'm certainly interested in adding in the ability to help tether to public wifi as well). You seem to be very interested in supporting IPv6-only (which, IMHO, is not going to exist for a long time), but for that use case, the DNS resolver needs to add support for IPv6 as well.

nneonneo commented 1 year ago

OK, awesome. I tested it out and everything looked good. Browsing the internet by proxy worked, and it was nice to have an IPv6 connection via my phone despite my computer's lack of one. Merging - thanks for your work on this!

nneonneo commented 1 year ago

@philrosenthal I just did a big rewrite to make everything async. The concurrent.futures approach didn't actually work - futures can't be cancelled once started, but async tasks can be. Please give the new version a try and let me know if it works - I haven't exercised all of the corner cases, but general browsing over both IPv4 and IPv6 seems to work fairly snappily.
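The cancellation difference can be seen in a small standalone sketch (not code from the rewrite): `Future.cancel()` returns False once the worker thread is running, while an asyncio task can still be cancelled after it has started. The blocking wait and the `asyncio.sleep(60)` are stand-ins for a slow connect.

```python
import asyncio
import threading
from concurrent.futures import ThreadPoolExecutor


def future_cancel_after_start() -> bool:
    """Return what Future.cancel() reports once the worker is running."""
    started, release = threading.Event(), threading.Event()

    def worker():
        started.set()
        release.wait()             # stand-in for a blocking connect()

    with ThreadPoolExecutor(max_workers=1) as pool:
        fut = pool.submit(worker)
        started.wait()             # make sure the worker is really running
        cancelled = fut.cancel()   # a running future cannot be cancelled
        release.set()              # let the worker finish so the pool exits
    return cancelled


async def task_cancel_after_start() -> bool:
    """Return whether an asyncio task ends up cancelled after it started."""
    task = asyncio.create_task(asyncio.sleep(60))  # stand-in for a slow connect
    await asyncio.sleep(0)         # yield once so the task actually starts
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return task.cancelled()
```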

philrosenthal commented 1 year ago

Wow, that is an impressively large rewrite in such quick time!

I've just started working on adding in functionality to allow it to tether to wifi (use case: While traveling, I run into wifi which is free/discounted for smartphone but expensive/not available for laptops ... this could be useful for those scenarios).

I've posted a very crude initial version to my github which does work, but is sloppy because it doesn't reuse any code.

I'll test out your code now.

-Phil

philrosenthal commented 1 year ago

I've started my testing by running it on my mac and using firefox (configured to use the proxy for resolving) on 127.0.0.1 ... The previous code worked that way, but ipv4-only.

Looks like there are some big problems in the code still, when testing that way:

127.0.0.1:64998: Exception: Host ('rtb-csync.smartadserver.com', 443) could not be resolved
127.0.0.1:64997: Exception: Host ('dsp.adfarm1.adition.com', 443) could not be resolved
127.0.0.1:65006: Exception: Host ('usersync.gumgum.com', 443) could not be resolved
127.0.0.1:65011: Exception: Host ('sync.outbrain.com', 443) could not be resolved
127.0.0.1:65012: Exception: Host ('usersync.gumgum.com', 443) could not be resolved
ERROR:socks5:127.0.0.1:51036: Exception: Host ('mfl.speedtest.sbcglobal.net.prod.hosts.ooklaserver.net', 8080) could not be resolved
ERROR:asyncio:socket.accept() out of system resource
socket: <asyncio.TransportSocket fd=9, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('0.0.0.0', 9876)>
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/asyncio/selector_events.py", line 159, in _accept_connection
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/socket.py", line 293, in accept
OSError: [Errno 24] Too many open files
ERROR:asyncio:socket.accept() out of system resource
socket: <asyncio.TransportSocket fd=9, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('0.0.0.0', 9876)>
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/asyncio/selector_events.py", line 159, in _accept_connection
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/socket.py", line 293, in accept
OSError: [Errno 24] Too many open files
ERROR:asyncio:socket.accept() out of system resource
socket: <asyncio.TransportSocket fd=9, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('0.0.0.0', 9876)>
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/asyncio/selector_events.py", line 159, in _accept_connection
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/socket.py", line 293, in accept
OSError: [Errno 24] Too many open files
ERROR:asyncio:socket.accept() out of system resource
socket: <asyncio.TransportSocket fd=9, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('0.0.0.0', 9876)>
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/asyncio/selector_events.py", line 159, in _accept_connection
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/socket.py", line 293, in accept
OSError: [Errno 24] Too many open files
ERROR:asyncio:socket.accept() out of system resource
socket: <asyncio.TransportSocket fd=9, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('0.0.0.0', 9876)>
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/asyncio/selector_events.py", line 159, in _accept_connection
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/socket.py", line 293, in accept
OSError: [Errno 24] Too many open files
ERROR:asyncio:socket.accept() out of system resource
socket: <asyncio.TransportSocket fd=9, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('0.0.0.0', 9876)>
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/asyncio/selector_events.py", line 159, in _accept_connection
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/socket.py", line 293, in accept
ERROR:socks5:127.0.0.1:51261: Exception: Host ('speedtest.orld.fl.wtsky.net.prod.hosts.ooklaserver.net', 8080) could not be resolved
ERROR:socks5:127.0.0.1:51262: Exception: Host ('mfl.speedtest.sbcglobal.net.prod.hosts.ooklaserver.net', 8080) could not be resolved
ERROR:socks5:127.0.0.1:51257: Exception: Host ('speedtest.orld.fl.wtsky.net.prod.hosts.ooklaserver.net', 8080) could not be resolved
ERROR:socks5:127.0.0.1:51256: Exception: Host ('speedtest.orld.fl.wtsky.net.prod.hosts.ooklaserver.net', 8080) could not be resolved
ERROR:socks5:127.0.0.1:51263: Exception: Host ('speed-server1.summit-broadband.com.prod.hosts.ooklaserver.net', 8080) could not be resolved
ERROR:socks5:127.0.0.1:51259: Exception: Host ('aax-us-east.amazon-adsystem.com', 443) could not be resolved
ERROR:socks5:127.0.0.1:51258: Exception: Host ('qsearch-a.akamaihd.net', 443) could not be resolved
ERROR:socks5:127.0.0.1:51255: Exception: Host ('tampp-speedtest-01.noc.bhn.net', 8080) could not be resolved
ERROR:socks5:127.0.0.1:51254: Exception: Host ('googleads.g.doubleclick.net', 443) could not be resolved
ERROR:asyncio:socket.accept() out of system resource
socket: <asyncio.TransportSocket fd=9, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('0.0.0.0', 9876)>
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/asyncio/selector_events.py", line 159, in _accept_connection
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/socket.py", line 293, in accept
OSError: [Errno 24] Too many open files
ERROR:asyncio:socket.accept() out of system resource
socket: <asyncio.TransportSocket fd=9, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('0.0.0.0', 9876)>
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/asyncio/selector_events.py", line 159, in _accept_connection
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/socket.py", line 293, in accept
OSError: [Errno 24] Too many open files
ERROR:asyncio:socket.accept() out of system resource
socket: <asyncio.TransportSocket fd=9, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('0.0.0.0', 9876)>
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/asyncio/selector_events.py", line 159, in _accept_connection
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/socket.py", line 293, in accept
OSError: [Errno 24] Too many open files
ERROR:socks5:127.0.0.1:51507: ConnectionResetError: [Errno 54] Connection reset by peer

It does somewhat work (i.e. pages do load most of the time), but I was occasionally getting intermittent browser errors. I will try it on my phone next, as I know that's obviously the primary intended use case ... but given that the previous code would have passed this test, it's probably indicative of some other problems here.

-Phil

nneonneo commented 1 year ago

Yikes, that's a lot of errors. Out of curiosity, what does this test look like? I tried opening about 50 concurrent tabs, and doing 1000 curl requests to invalid + valid domains in parallel with speedtest.net running, and could not hit these errors. It kind of looks like you ran out of file descriptors, which is a real possibility with enough simultaneous connections: what did the Connection counter look like?

Maybe the file descriptor limit is too low. What do ulimit -Sa and ulimit -Ha produce? I have 2560 for the soft limit (-Sa, -n: file descriptors). If I use ulimit -Sn 64 to artificially decrease the limit, I hit the errors you described pretty quickly. On Pythonista, a quick test suggests that the limit is also around 2560 (I was able to put 2550 open files in a list before getting an error), although this may be OS/hardware dependent (iOS 16.5.1 on an iPhone 12 Pro Max).
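As a sketch, the soft limit can also be inspected and raised from inside Python with the standard `resource` module (Unix-only; whether it is available under Pythonista is an assumption that would need checking). `raise_fd_soft_limit` is a hypothetical helper, and 2560 is just the value mentioned above.

```python
import resource


def raise_fd_soft_limit(target: int = 2560) -> int:
    """Raise the soft RLIMIT_NOFILE toward `target`, capped at the hard limit.
    Returns the soft limit in effect afterwards."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    wanted = target if hard == resource.RLIM_INFINITY else min(target, hard)
    if soft == resource.RLIM_INFINITY or soft >= wanted:
        return soft            # already high enough; never lower the limit
    try:
        resource.setrlimit(resource.RLIMIT_NOFILE, (wanted, hard))
    except (ValueError, OSError):
        return soft            # the OS refused; keep the current limit
    return wanted
```

Calling this once at startup would be roughly equivalent to running `ulimit -Sn 2560` in the shell before launching the proxy.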

nneonneo commented 1 year ago

If your file descriptor limit is high enough but you are still getting problems, it might be caused by a file descriptor leak. In that case, lsof -p <pid of socks proxy> could help - you should see two socket connections per SOCKS connection, plus around 10 other open files (console, some pipes and other UNIX sockets, listening sockets, etc.). Anything more than that would suggest an fd leak, which would definitely be a bug.
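As a rough cross-check of the lsof numbers, the process can also count its own descriptors and compare them with the "two sockets per connection plus about 10" estimate above. This is a sketch under assumptions: it relies on `/dev/fd` being available (true on macOS and Linux, unverified on iOS), and both helper names are made up for illustration.

```python
import os
import socket


def count_open_fds() -> int:
    """Count this process's open file descriptors via /dev/fd."""
    # Listing the directory briefly opens one descriptor of its own,
    # which shows up in the listing, so subtract it.
    return len(os.listdir("/dev/fd")) - 1


def expected_fds(connections: int, baseline: int = 10) -> int:
    """Two sockets per SOCKS connection plus a fixed baseline of other files."""
    return 2 * connections + baseline


if __name__ == "__main__":
    before = count_open_fds()
    socks = [socket.socket() for _ in range(3)]  # unconnected sockets still use fds
    print("fds added by 3 sockets:", count_open_fds() - before)
    for s in socks:
        s.close()
    print("expected fds at 100 connections:", expected_fds(100))
```

A count that keeps growing well past `expected_fds(n)` for the current number of connections would point toward a leak rather than a limit that is merely too low.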

It's very weird that the old code would work, since the old code is not any more economical with fds than the new code. The new DNS logic might result in more UDP sockets temporarily, which could produce failures with a very large influx of DNS requests, but you'd have to be doing a few hundred DNS requests simultaneously to get anywhere near failure.

philrosenthal commented 1 year ago

My test was just loading www.speedtest.net on a single tab of Firefox, nothing else.

I found that some test sources cause it, and others don't... I'm not sure why. AT&T in Miami, FL seems to cause it reliably for me.

My file descriptor limit seems to be 256. I don't recall ever doing anything to set it that way. In any event, the previous version did not cause this problem.

I'll try raising it to 2560 and see if it is more reliable that way.

Command outputs below:

LookingGlass:iOS-SOCKS-Server-master winter$ lsof -p 66730|wc -l
208
LookingGlass:iOS-SOCKS-Server-master winter$ lsof -p 66730|wc -l
204
LookingGlass:iOS-SOCKS-Server-master winter$ lsof -p 66730|wc -l
205
LookingGlass:iOS-SOCKS-Server-master winter$ lsof -p 66730|wc -l
205
LookingGlass:iOS-SOCKS-Server-master winter$ lsof -p 66730|wc -l
210
LookingGlass:iOS-SOCKS-Server-master winter$ lsof -p 66730|wc -l
204
LookingGlass:iOS-SOCKS-Server-master winter$ lsof -p 66730|wc -l
202
LookingGlass:iOS-SOCKS-Server-master winter$ lsof -p 66730|wc -l
216
LookingGlass:iOS-SOCKS-Server-master winter$ lsof -p 66730|wc -l
216
LookingGlass:iOS-SOCKS-Server-master winter$ lsof -p 66730|wc -l
226
LookingGlass:iOS-SOCKS-Server-master winter$ lsof -p 66730|wc -l
262
LookingGlass:iOS-SOCKS-Server-master winter$ lsof -p 66730|wc -l
262
LookingGlass:iOS-SOCKS-Server-master winter$ lsof -p 66730|wc -l
270
LookingGlass:iOS-SOCKS-Server-master winter$ lsof -p 66730|wc -l
268
LookingGlass:iOS-SOCKS-Server-master winter$ lsof -p 66730|wc -l
261

LookingGlass:iOS-SOCKS-Server-master winter$ ulimit -Sa
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
file size (blocks, -f) unlimited
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 256
pipe size (512 bytes, -p) 1
stack size (kbytes, -s) 8176
cpu time (seconds, -t) unlimited
max user processes (-u) 10666
virtual memory (kbytes, -v) unlimited
LookingGlass:iOS-SOCKS-Server-master winter$ ulimit -Ha
core file size (blocks, -c) unlimited
data seg size (kbytes, -d) unlimited
file size (blocks, -f) unlimited
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) unlimited
pipe size (512 bytes, -p) 1
stack size (kbytes, -s) 65520
cpu time (seconds, -t) unlimited
max user processes (-u) 16000
virtual memory (kbytes, -v) unlimited


nneonneo commented 1 year ago

I guess you could try setting it to 256 explicitly and seeing if you trip the errors on the old version of the proxy...

I think my problem might be my adblocker. I see a lot of advertising domains failed to resolve. I wonder if maybe the ad connections + speedtest pushed the total connection count past ~100, which is the point at which 256 fds would be insufficient.

philrosenthal commented 1 year ago

The old code seems to consistently use fewer sockets than the new code, and also seems to handle going slightly over more gracefully. With the limit set to 256, it goes up to 280 or so, with no errors. Setting it to 128 has it going to 180 or so, and still loads everything fine.

Setting the soft limit to 64 causes errors in both the proxy and the browser.

The new code seems to want to go into the 300-500 range of open sockets just from loading speedtest.net.

I'm not using an adblocker for any of my testing (though I do normally use one for my normal use case).

I didn't look through your code [at all], but a few possible ideas:
1) The slower connection from happy eyeballs isn't being closed after the faster connection is accepted (IPv4 or IPv6, whichever loses the race).
2) Connections aren't being closed after the browser (or server) closes the connection, either on the side that was directly closed or on the corresponding opposite side (e.g. client closed, so also close the server; or server closed, so also close the client).
3) For some reason, the old code was more aggressive about closing sockets?

I looked through the list of open sockets in lsof, and all connections appear to be ESTABLISHED, so none seem to be sitting stuck open as (2) would suggest. The number of local connections (port 9876) is also only slightly lower than the number of external connections (ports 443/80), which would suggest (1) is not happening either.
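To make possibility (1) concrete, here is a minimal sketch of a staggered connection race that cancels pending attempts and closes any losing connection that did complete. It is not the proxy's implementation; `race_connect` and its arguments are hypothetical, and each attempt is assumed to be a zero-argument coroutine function returning an object with a `close()` method.

```python
import asyncio


async def race_connect(attempts, stagger: float = 0.3):
    """Start the attempts `stagger` seconds apart and return the first success."""

    async def delayed(index, connect):
        await asyncio.sleep(index * stagger)
        return await connect()

    tasks = [asyncio.create_task(delayed(i, c)) for i, c in enumerate(attempts)]
    winner = None
    try:
        for next_done in asyncio.as_completed(tasks):
            try:
                winner = await next_done
                return winner
            except OSError:
                continue  # this attempt failed, wait for the next one
        raise OSError("all connection attempts failed")
    finally:
        # Cancel attempts that are still pending...
        for task in tasks:
            if not task.done():
                task.cancel()
        results = await asyncio.gather(*tasks, return_exceptions=True)
        # ...and close any loser that connected anyway, so its socket is not leaked.
        for result in results:
            if result is not winner and not isinstance(result, BaseException):
                result.close()
```

If the cleanup in `finally` were missing, every race in which both address families connect would leave one extra socket open, which is the kind of leak (1) describes.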

Maybe the old code is just more aggressive about closing connections for some reason?

-Phil


nneonneo commented 1 year ago

I think I found and fixed the bug in fb0fecf. I wasn't closing connections properly when receiving RST, which was causing things to break, roughly corresponding to your possibility (2). Give that a spin...
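For illustration only (this is not the actual change in fb0fecf), a relay that tears down both sides when either side ends or resets could look roughly like the sketch below. The names `pipe` and `relay` are made up, and the readers and writers are assumed to behave like asyncio stream objects.

```python
import asyncio


async def pipe(reader, writer) -> None:
    """Copy bytes from reader to writer until EOF or a connection reset."""
    try:
        while True:
            data = await reader.read(65536)
            if not data:
                break  # clean EOF
            writer.write(data)
            await writer.drain()
    except (ConnectionResetError, BrokenPipeError):
        pass  # treat RST like EOF; the caller closes both sides


async def relay(client_r, client_w, upstream_r, upstream_w) -> None:
    """Relay in both directions and close both sockets once either direction stops."""
    tasks = [
        asyncio.create_task(pipe(client_r, upstream_w)),
        asyncio.create_task(pipe(upstream_r, client_w)),
    ]
    try:
        await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    finally:
        for task in tasks:
            task.cancel()
        await asyncio.gather(*tasks, return_exceptions=True)
        # Closing both writers releases both file descriptors.
        for writer in (client_w, upstream_w):
            writer.close()
```

Note that this simplification also tears the connection down on a half-close, which a real proxy may prefer to handle more gently.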

nneonneo commented 1 year ago

@philrosenthal please give the new code a try and let me know if that works better.

philrosenthal commented 1 year ago

Hello,

Apologies, I've been really busy the past week as I was preparing for the closing on a home purchase.

I'll have a lot more time tomorrow to actually look into it, but in an early test (with my iPhone as the SOCKS proxy), I see a lot of "AssertionError" and "Exception: Host could not be resolved".

The cell service where I was testing it is very poor, so it could be related to that, but the previous version (which admittedly does have the logging level turned much lower) doesn't flash any messages like that.


philrosenthal commented 1 year ago

Hello,

Running it on my mac, I still get a lot of problems.

Now it is more clear about the problem being running out of open file descriptors:
ERROR:asyncio:socket.accept() out of system resource
socket: <asyncio.TransportSocket fd=10, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('0.0.0.0', 9876)>
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/asyncio/selector_events.py", line 159, in _accept_connection
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/socket.py", line 293, in accept
OSError: [Errno 24] Too many open files
ERROR:asyncio:socket.accept() out of system resource
socket: <asyncio.TransportSocket fd=10, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('0.0.0.0', 9876)>
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/asyncio/selector_events.py", line 159, in _accept_connection
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/socket.py", line 293, in accept
OSError: [Errno 24] Too many open files
ERROR:asyncio:socket.accept() out of system resource
socket: <asyncio.TransportSocket fd=10, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('0.0.0.0', 9876)>
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/asyncio/selector_events.py", line 159, in _accept_connection
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/socket.py", line 293, in accept
OSError: [Errno 24] Too many open files
ERROR:asyncio:socket.accept() out of system resource
socket: <asyncio.TransportSocket fd=10, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('0.0.0.0', 9876)>
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/asyncio/selector_events.py", line 159, in _accept_connection
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/socket.py", line 293, in accept
OSError: [Errno 24] Too many open files
ERROR:asyncio:socket.accept() out of system resource
socket: <asyncio.TransportSocket fd=10, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('0.0.0.0', 9876)>
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/asyncio/selector_events.py", line 159, in _accept_connection
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/socket.py", line 293, in accept
OSError: [Errno 24] Too many open files

I'm really confused as to why the asyncio version is so much more problematic about this issue. Raising ulimit to 2560 does seem to resolve that issue.

In addition to that, when running on my iPhone and doing speedtest.net tests, I see a few AssertionError messages, as well as Exception: Host could not be resolved for domains zdbb.net and gurgle.speedtest.net and a few other domains (which I absolutely can resolve using a normal dns lookup).

If you ignore the errors, it works well enough that the browser manages to load things (I suppose just by retrying some failed lookups and failed connections), but there are clearly some minor underlying issues still happening.

The refactoring you've done is huge and I'm going to be a lot busier over the next few weeks with several projects, so I don't have the time to fully understand all the changes made. If you can provide some instructions on where I can add some debugging prints to help isolate some of these issues, I'm happy to help with that.

All-in-all, I know the final non-async version was not perfect, and it seems that this implementation traded some flaws for a different set of flaws.

I know asyncio is very tricky to make work -- and most of my professional work avoids that issue entirely by using either threads or event driven io to avoid the blocking issue entirely while still using "blocking" sockets. In other languages like C and Go, such designs are practical, but I believe that under Python, those designs are not practical.
