IndySockets / Indy

Indy - Internet Direct
https://www.indyproject.org
461 stars 155 forks source link

IdTCPClient ConnectTimeout broken on linux >6.1.32 (potentially affects higher-ups like TIdHTTP* ? ) #491

Closed kralo closed 1 year ago

kralo commented 1 year ago

When trying to connect to a port that silently drops the SYN packets (i.e. doesn't answer), this program used to work

program conecttimeout;

{$mode objfpc}{$H+}

uses
  {$IFDEF UNIX}{$IFDEF UseCThreads}
  cthreads,
  {$ENDIF}{$ENDIF}
  Classes,
  IdBaseComponent,
  IdTCPClient;

var
  aIdClient:TIdTcpClient;
begin

  aIdClient := TIdTCPClient.Create;
  try
    aIdClient.ConnectTimeout:=3000; // ms
    aIdClient.ReadTimeout:=200; //ms
    // for testing do:   iptables -I INPUT -p tcp --dport 8083 -j DROP
    aIdClient.Connect('127.0.0.1',8083);
    aIdClient.Disconnect;
  finally
  end;
  aIdClient.free;
end.

On Linux up to (including) commit https://github.com/torvalds/linux/commit/6ffc57ea004234d9373c57b204fd10370a69f392 it works as expected, waiting 3 seconds and then stopping:

$ strace -f -e trace=open,socket,connect,bind -t ./connecttimeout
09:46:57 socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 3
09:46:57 bind(3, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
strace: Process 1163 attached
[pid  1163] 09:46:57 connect(3, {sa_family=AF_INET, sin_port=htons(8083), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 ECONNRESET (Die Verbindung wurde vom Kommunikationspartner zurückgesetzt)
[pid  1163] 09:47:00 +++ exited with 0 +++
An unhandled exception occurred at $00000000004E2A1C:
EIdConnectTimeout: Connect timed out.
  $00000000004E2A1C  DOCONNECTTIMEOUT,  line 328 of ../lib/indy10/Core/IdIOHandlerStack.pas
  $00000000004E1D20  OPEN,  line 369 of ../lib/indy10/Core/IdIOHandlerSocket.pas
  $000000000045DA98  CONNECT,  line 328 of ../lib/indy10/Core/IdTCPClient.pas
  $000000000045E3A4  CONNECT,  line 514 of ../lib/indy10/Core/IdTCPClient.pas
  $00000000004009F0  main,  line 20 of connecttimeout.lpr
  $0000007F80782E18
  $0000000000400928

09:47:02 +++ exited with 217 +++

However, with commit https://github.com/torvalds/linux/commit/4faeee0cf8a5d88d63cdbc3bab124fb0e6aed08c the timeout mechanism is broken; it now waits around 2 minutes, which is roughly the same that wget would. net.ipv4.tcp_syn_retries = 6 so I assume this is expected from retries after 2+4+8+16+32+64=126 seconds.

$ strace -f -e trace=open,socket,connect,bind -t ./connecttimeout
09:36:50 socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 3
09:36:50 bind(3, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
strace: Process 1107 attached
[pid  1107] 09:36:50 connect(3, {sa_family=AF_INET, sin_port=htons(8083), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 ETIMEDOUT (Die Wartezeit für die Verbindung ist abgelaufen)
[pid  1107] 09:38:59 +++ exited with 0 +++
An unhandled exception occurred at $00000000004E2A1C:
EIdConnectTimeout: Connect timed out.
  $00000000004E2A1C  DOCONNECTTIMEOUT,  line 328 of ../lib/indy10/Core/IdIOHandlerStack.pas
  $00000000004E1D20  OPEN,  line 369 of ../lib/indy10/Core/IdIOHandlerSocket.pas
  $000000000045DA98  CONNECT,  line 328 of ../lib/indy10/Core/IdTCPClient.pas
  $000000000045E3A4  CONNECT,  line 514 of ../lib/indy10/Core/IdTCPClient.pas
  $00000000004009F0  main,  line 20 of connecttimeout.lpr
  $0000007F936D5E18
  $0000000000400928

09:39:01 +++ exited with 217 +++
~/sockettest $ strace -f -e trace=open,socket,connect,bind -t wget 127.0.0.1:8083
--2023-07-18 09:40:24--  http://127.0.0.1:8083/
Verbindungsaufbau zu 127.0.0.1:8083 … 09:40:24 socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 3
09:40:24 connect(3, {sa_family=AF_INET, sin_port=htons(8083), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 ETIMEDOUT (Die Wartezeit für die Verbindung ist abgelaufen)
fehlgeschlagen: Die Wartezeit für die Verbindung ist abgelaufen.
Erneuter Versuch.

--2023-07-18 09:42:35--  (Versuch: 2)  http://127.0.0.1:8083/
Verbindungsaufbau zu 127.0.0.1:8083 … 09:42:35 socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 3
09:42:35 connect(3, {sa_family=AF_INET, sin_port=htons(8083), sin_addr=inet_addr("127.0.0.1")}, 16^C

I can only guess that the new linux code, that looks for waiting threads now interferes with Indy's timeout-counting mechanism.

What can be done?

debian linux/aarch64 fpc 2.0.10+dfsg-4+b2 , Indy Version https://github.com/IndySockets/Indy/commit/5bcd9ce2466ac08729203f6db6a9ad0dc10d5390

rlebeau commented 1 year ago

I'm not familiar with how the Linux kernel works, but Indy has never relied on OS-level timeouts for its ConnectTimeout handling, especially since there is no standard socket option to specify a timeout for connect(). As Indy uses blocking sockets, it runs connect() in its own worker thread, waiting up to ConnectTimeout for that thread to terminate. If the thread does not terminate in time, shutdown() is then called to abort the connection in progress, and close() to close the socket. So, the only hangup should be if connect() is not respecting shutdown()/close() in a timely manner.

That being said, there is an open ticket to re-design ConnectTimeout handling to use a non-blocking connect() with select()/(e)poll() on platforms which support that option. But that is work for a future release.

kralo commented 1 year ago

So, the only hangup should be if connect() is not respecting shutdown()/close() in a timely manner.

Yes, and as I understand, the kernel now prohibits shutting down the connection attempt, and Indy's mechanism is now useless.

https://github.com/torvalds/linux/blob/4faeee0cf8a5d88d63cdbc3bab124fb0e6aed08c/net/ipv4/tcp.c#L3084-L3088

shutdown() does not work... further down TIdStackUnix.WSShutdown returns -1 . So the thread keeps waiting until eventually the OS decides to give up on connect()

rlebeau commented 1 year ago

Yes, and as I understand, the kernel now prohibits shutting down the connection attempt, and Indy's mechanism is now useless.

Oh, well, that would certainly suck :-( That doesn't seem like a very good change for them to make in the kernel, and it certainly isn't documented behavior, AFAICS.

https://github.com/torvalds/linux/blob/4faeee0cf8a5d88d63cdbc3bab124fb0e6aed08c/net/ipv4/tcp.c#L3084-L3088

shutdown() does not work... further down TIdStackUnix.WSShutdown returns -1 . So the thread keeps waiting until eventually the OS decides to give up on connect()

Then I guess I need to bump up the priority on #3.

kralo commented 1 year ago

I have verified it working again. Without changing anything in freepascal, it works in linux 6.6.2 and 6.1.63. (timeout in the program is 3000ms, see above).

Most probably fixed by torvalds/linux@419ce133ab928ab5efd7b50b2ef36ddfd4eadbd2 , as mentioned in #493 .

$ uname -a
Linux 6.6.2-v8+ #1 SMP PREEMPT Tue Nov 21 22:28:24 UTC 2023 
aarch64 GNU/Linux

and 

$ uname -a
Linux 6.1.63-v8+ #1700 SMP PREEMPT Thu Nov 23 14:10:01 GMT 2023 
aarch64 GNU/Linux

/dev/shm $  strace -r -f -e trace=open,socket,connect,bind 
 ./conecttimeout
      0.000000 socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 3
     0.000889 bind(3, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
strace: Process 1162 attached
[pid  1162]      0.002929 connect(3, {sa_family=AF_INET, sin_port=htons(8090), sin_addr=inet_addr("192.168.1.184")}, 16) = -1 ECONNRESET 
[pid  1162]      3.008621 +++ exited with 0 +++
An unhandled exception occurred at $00000000004E40A4:
EIdConnectTimeout: Connect timed out.
  $00000000004E40A4  DOCONNECTTIMEOUT,  line 328 of ../../lib/indy10/Core/IdIOHandlerStack.pas
  $00000000004E32CC  OPEN,  line 369 of ../../lib/indy10/Core/IdIOHandlerSocket.pas
  $000000000045DFC8  CONNECT,  line 328 of ../../lib/indy10/Core/IdTCPClient.pas
  $000000000045E90C  CONNECT,  line 514 of ../../lib/indy10/Core/IdTCPClient.pas
  $00000000004009F0  main,  line 22 of ../../../../../dev/shm/conecttimeout
  $0000007F9AF1EE18
  $0000000000400928

     1.503324 +++ exited with 217 +++