When the botocore.config.Config option tcp_keepalive=True is set, the TCP socket is configured with the keepalive socket option (socket.SO_KEEPALIVE). By default, Linux sets the TCP keepalive time parameter to 7200 seconds, which exceeds the AWS NAT Gateway default idle timeout of 350 seconds [source], so the NAT gateway drops the idle connection long before the first keepalive probe is sent.
As a result, the client never receives the response from a Lambda function when all of the following conditions hold:
The Lambda function is invoked in synchronous mode (InvocationType='RequestResponse').
The invocation occurs within a VPC where a NAT gateway is required to access the internet from a private subnet.
The execution time of the Lambda function exceeds 350 seconds.
This limitation can be overcome by also setting socket.TCP_KEEPIDLE, socket.TCP_KEEPINTVL and socket.TCP_KEEPCNT in the _compute_socket_options function call when tcp_keepalive is enabled.
socket.IPPROTO_TCP is used as the option level for cross-platform compatibility.
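For illustration only, here is a minimal sketch of how these options might be assembled and applied using just the standard library. The helper names build_keepalive_options and apply_options, and the default values of 60, 30 and 5, are hypothetical and are not taken from the submitted code. The TCP_KEEP* constants are looked up defensively because they may be missing on some platforms.

```python
import socket


def build_keepalive_options(idle=60, interval=30, count=5):
    """Return (level, option, value) tuples that enable TCP keepalive.

    Hypothetical helper: names and default values are illustrative
    assumptions, not botocore's implementation.
    """
    # SO_KEEPALIVE turns keepalive probing on for the socket.
    options = [(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)]
    tuning = (
        ("TCP_KEEPIDLE", idle),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between probes
        ("TCP_KEEPCNT", count),       # failed probes before the connection is dropped
    )
    for name, value in tuning:
        # Skip constants that this platform's socket module does not expose.
        option = getattr(socket, name, None)
        if option is not None:
            options.append((socket.IPPROTO_TCP, option, value))
    return options


def apply_options(sock, options):
    """Apply each (level, option, value) tuple to the given socket."""
    for level, option, value in options:
        sock.setsockopt(level, option, value)
```

A list of tuples in this shape can be applied to any TCP socket before it connects, and the values can be read back with getsockopt to confirm they took effect.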
The submitted code automatically calculates these values based on the read timeout. Another option would be to have them supplied in the scope/client object.
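As a rough sketch of what deriving the values from the read timeout could look like: the formula below is an assumption made for illustration and may differ from the submitted code. The function name derive_keepalive_params, the halving heuristic, and the default of 3 probes are all hypothetical. The only figure taken from the description is the 350-second NAT gateway timeout.

```python
# Idle timeout of the AWS NAT Gateway, as quoted in the description above.
NAT_GATEWAY_IDLE_TIMEOUT = 350


def derive_keepalive_params(read_timeout,
                            idle_limit=NAT_GATEWAY_IDLE_TIMEOUT,
                            probes=3):
    """Return (idle, interval, probes) in whole seconds.

    Hypothetical heuristic: send the first probe at half of the smaller
    of the read timeout and the idle limit, so that traffic flows well
    before either one expires.
    """
    idle = max(1, int(min(read_timeout, idle_limit) // 2))
    # Spread the follow-up probes over roughly one more idle period.
    interval = max(1, idle // probes)
    return idle, interval, probes
```

With a 900-second read timeout this yields a first probe after 175 seconds, comfortably inside the 350-second window. Letting callers pass the three values explicitly, as the alternative above suggests, would avoid relying on any such heuristic.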
Fixes issues: https://github.com/boto/boto3/issues/2424, https://github.com/boto/boto3/issues/2510 and https://github.com/boto/botocore/issues/2916.
Fargate recently had a similar solution implemented to support this use case: https://aws.amazon.com/blogs/containers/announcing-additional-linux-controls-for-amazon-ecs-tasks-on-aws-fargate/.