Add the option to set socket.TCP_KEEPIDLE and socket.TCP_KEEPINTVL

mtoma commented 1 year ago

Describe the feature

With the botocore.config.Config option tcp_keepalive=True the TCP socket is opened with the keep alive socket option (socket.SO_KEEPALIVE) but the interval used for TCP keepalive probes is taken from the system default values in the proc filesystem /proc/sys/net/ipv4/tcp_keepalive_time on Linux.

Changing this value requires the root access which is often not available or is just read-only as in the AWS Fargate containers.

The linux default value for this parameter is 7200 seconds which exceeds by far the AWS VPC timeout of 350 seconds which makes boto3 invoke() call to start a Lambda function in synchronous mode (RequestResponse) loose contact with the Lambda backend ent times out without getting back any response if the Lambda execution time exceeds 350 seconds and is started from a VPC.

Use Case

It is currently impossible to get an answer from a Lambda function:

which is started in synchronous mode (InvocationType='RequestResponse')
inside a VPC (which cuts inactive connections after 350 seconds)
and with an execution time greater than 350 seconds

The invoke() call times out after read_timeout=XXX seconds because the TCP connection is lost.

I want this call to successfully finish maintaining the TCP connection alive with TCP probes sent with a user defined interval.

Proposed Solution

The only workaround today is to override the _compute_socket_options private method of the botocore.args.ClientArgsCreator class.

The botocore.config.Configclass should accept an additional parameter (for example tcp_keepalive_time) to allow the user to set the value for (on linux, but is different on OSX and possibly windows) (socket.SOL_TCP, socket.TCP_KEEPIDLE, 60), (socket.SOL_TCP, socket.TCP_KEEPINTVL, 60) parameters.

Other Information

The current workaround is quite ugly:

# Create a nev version of the botocore private method _compute_socket_options:

original_compute_socket_options = botocore.args.ClientArgsCreator._compute_socket_options
def add_aditionnal_tcp_keepalive_options(self, scoped_config, client_config=None):
    print('add_aditionnal_tcp_keepalive_options')
    return original_compute_socket_options(self, scoped_config, client_config) + [(socket.SOL_TCP, socket.TCP_KEEPIDLE, 60), (socket.SOL_TCP, socket.TCP_KEEPINTVL, 60)]

# Override the class method before the invoke call:

botocore.args.ClientArgsCreator._compute_socket_options = add_aditionnal_tcp_keepalive_options
client = boto3.client('lambda', config=aws_cfg)
response = client.invoke(
    FunctionName=target_environment_lambda_name,
    InvocationType='RequestResponse',
    Payload=json.dumps(lambda_params),
    LogType='Tail',
)

Acknowledgements

[X] I may be able to implement this feature request
[ ] This feature might incur a breaking change

SDK version used

{'boto3_version': '1.26.7', 'botocore_version': '1.29.7'}

Environment details (OS name and version, etc.)

Amazon Linux

tim-finnigan commented 1 year ago

Thanks @mtoma for creating the feature request. Could you elaborate a bit more on why configuring tcp_keepalive isn't sufficient for this use case? In this Knowledge Center post it sounds like that is the recommended solution: https://repost.aws/knowledge-center/lambda-vpc-timeout.

Also linked in that post is another one on troubleshooting retry/timeout issues in Lambda that may also be relevant here: https://repost.aws/knowledge-center/lambda-function-retry-timeout-sdk.

mtoma commented 1 year ago

Sure, the current socket options only sets the keepalive socket flag. The TCP probes are still sent using the OS default parameter /proc/sys/net/ipv4/tcp_keepalive_time as explained here:

https://aws.amazon.com/blogs/networking-and-content-delivery/implementing-long-running-tcp-connections-within-vpc-networking/

Also I did all my tests with the tcp_keepalive=True parameter set and validated it is still loosing connection with the Lambda backend after exactly 350 seconds. I fired a AWS support ticket and got confirmation from the AWS support team that the way to go would be to change the /proc/sys/net/ipv4/tcp_keepalive_time kernel parameter which is unfortunately read only on Fargate containers and will also be in environments where root access is impossible.

It should be feasible to setup a test environment for this issue using an EC2 in a VPC and let it call a Lambda that sleeps for 360 seconds.

tim-finnigan commented 1 year ago

Thanks for following up. Regarding this:

I fired a AWS support ticket and got confirmation from the AWS support team that the way to go would be to change the /proc/sys/net/ipv4/tcp_keepalive_time kernel parameter which is unfortunately read only on Fargate containers and will also be in environments where root access is impossible.

Since this is a limitation on Fargate's side, it sounds like something that should be addressed on that end. I found an issue for this here: https://github.com/aws/containers-roadmap/issues/460. It is marked as coming-soon on the containers roadmap — I recommend subscribing to that issue for notifications. I don't think this is likely to get addressed in botocore if an implementation is already in the works for Fargate.

mtoma commented 1 year ago

The problem is this issue is not related to Fargate at all, it is only an example of infrastructure where it happens and is impossible to fix using a system wide parameter. Another maybe much more realistic use case is calling a Lambda from another Lambda when the caller Lambda is in a VPC.

The real point here is that without this option the tcp_keepalive=True seems to me entirely useless. On linux the default value for /proc/sys/net/ipv4/tcp_keepalive_time is 7200 seconds. This means the first TCP probe will be sent 7200 seconds after the invoke() call in about all use cases. This obviously seems perfectly pointless when the max execution time of a Lambda function is 15 minutes.

From the parameter setting tcp_keepalive=True I'm expecting "don't let my TCP connection time out by sending TCP probes to keep it alive". This doesn't happen.

If this isn't the purpose of the tcp_keepalive=True parameter than I don't really understand what it was designed for as I don't see any use case where True/False would make any difference. Is there some use case I'm not aware of?

Otherwise I don't understand why It should be required from the user to change a systemwide kernel parameter that could have massive impact on the the underlying system if is simply possible to set it per socket connection with (socket.SOL_TCP, socket.TCP_KEEPIDLE, 60), (socket.SOL_TCP, socket.TCP_KEEPINTVL, 60)

neilferreira commented 1 year ago

Since this is a limitation on Fargate's side, it sounds like something that should be addressed on that end.

Just to pile on to this but it seemsl like @mtoma has this covered, any infrastructure deployed in a private subnet that communicates with the internet via a NAT Gateway is impacted by this. The NAT Gateway will drop a KeepAlive unexpectedly after 350 seconds, and an attempt by your application to re-use a connection will fail unexpectedly

ref: https://docs.aws.amazon.com/vpc/latest/userguide/nat-gateway-troubleshooting.html#nat-gateway-troubleshooting-timeout

tim-finnigan commented 1 year ago

Thanks for following up here and sharing more info. I brought this feature request up for discussion with the team, and the consensus was there are some valid points made here that are worth further investigation.

One of my colleagues also found this blog post on implementing long-running TCP Connections within VPC networking which provides some more context around the issue. https://aws.amazon.com/blogs/networking-and-content-delivery/implementing-long-running-tcp-connections-within-vpc-networking/.

For now we can use this issue to track +1 (👍)s, use cases, and possible approaches to implementation.

Samuelstephenr commented 9 months ago

Much needed feature. Tweaking the os level configuration for a lambda code can be impactful.

neilferreira commented 9 months ago

FWIW @Samuelstephenr this is what I'm doing to avoid patching the entire Lambda OS-level configuration.

import logging
import socket

import requests
from urllib3.connection import HTTPConnection

logger = logging.getLogger()

# Override the HTTPConnection.default_socket_options to discard a Keep-Alive after 300 seconds, since
# AWS NAT Gateway will discard the connection after 350 seconds
# Nice explainer on https://aws.amazon.com/premiumsupport/knowledge-center/lambda-vpc-timeout/
# Variables exlained on https://man7.org/linux/man-pages/man7/tcp.7.html
api_socket_options = HTTPConnection.default_socket_options
try:
    api_socket_options = api_socket_options + [
        (socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1),
        (socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 300),
        (socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 20),
        (socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 5),
    ]
except AttributeError as e:
    logger.warning(
        f"Not patching api_socket_options as this operating system could not use a given socket attribute. {e}"
    )

adapter = requests.adapters.HTTPAdapter()
adapter.init_poolmanager(
    connections=requests.adapters.DEFAULT_POOLSIZE,
    maxsize=requests.adapters.DEFAULT_POOLSIZE,
    block=requests.adapters.DEFAULT_POOLBLOCK,
    socket_options=api_socket_options,
)

request_session = requests.Session()
request_session.mount("https://", adapter)

request_session.post("url..")

amberkushwaha commented 3 months ago

Add the option to set socket in the value binding behaviour of the file in the section policy and proper binding status docs for contact and issue bending behaviour of the file and more prompt issue factors in it.

boto / botocore