aws / aws-lambda-ruby-runtime-interface-client

Unhandled SIGTERM on Lambda shutdown #28

Open plukevdh opened 1 year ago

plukevdh commented 1 year ago

I'm not sure this is a bug so much as a question about the appropriate way to handle it.

Background

The graceful shutdown repo notes that the Lambda runtime sends a SIGTERM when shutting down a Lambda instance, and that the signal needs to be handled from within the handler runtime. When using one of the AWS-provided runtimes (Ruby in this case), we've never seen the SIGTERM bubble to the surface. However, after we started building our own images and using the RIC to manage the Lambda API interactions, we began getting a stack trace that originates entirely outside of our code, specifically in the RIC code that polls for the next invocation.

The Issue

The error in question (excluding some implementation-specific details) is as follows:

SignalException: SIGTERM
/usr/share/ruby3.2/net/protocol.rb:229:in `wait_readable': SIGTERM (SignalException)
    from /usr/share/ruby3.2/net/protocol.rb:229:in `rbuf_fill'
    from /usr/share/ruby3.2/net/protocol.rb:199:in `readuntil'
    from /usr/share/ruby3.2/net/protocol.rb:209:in `readline'
    from /usr/share/ruby3.2/net/http/response.rb:158:in `read_status_line'
    from /usr/share/ruby3.2/net/http/response.rb:147:in `read_new'
    from /usr/share/ruby3.2/net/http.rb:1862:in `block in transport_request'
    from /usr/share/ruby3.2/net/http.rb:1853:in `catch'
    from /usr/share/ruby3.2/net/http.rb:1853:in `transport_request'
    from /usr/share/ruby3.2/net/http.rb:1826:in `request'
    from /usr/share/ruby3.2/net/http.rb:1575:in `get'
    from /var/task/vendor/bundle/ruby/3.2.0/gems/aws_lambda_ric-2.0.0/lib/aws_lambda_ric/lambda_server.rb:28:in `block in next_invocation'
    from /usr/share/ruby3.2/net/http.rb:1238:in `start'
    from /var/task/vendor/bundle/ruby/3.2.0/gems/aws_lambda_ric-2.0.0/lib/aws_lambda_ric/lambda_server.rb:27:in `next_invocation'
    from /var/task/vendor/bundle/ruby/3.2.0/gems/aws_lambda_ric-2.0.0/lib/aws_lambda_ric.rb:64:in `wait_for_invocation'
    from /var/task/vendor/bundle/ruby/3.2.0/gems/aws_lambda_ric-2.0.0/lib/aws_lambda_ric.rb:58:in `start_runtime_loop'
    from /usr/local/share/ruby3.2-gems/gems/aws_lambda_ric-2.0.0/lib/aws_lambda_ric.rb:42:in `run'
    from /usr/local/share/ruby3.2-gems/gems/aws_lambda_ric-2.0.0/lib/aws_lambda_ric/bootstrap.rb:35:in `bootstrap_handler'
    from /usr/local/share/ruby3.2-gems/gems/aws_lambda_ric-2.0.0/lib/aws_lambda_ric/bootstrap.rb:8:in `start'
    from /usr/local/share/ruby3.2-gems/gems/aws_lambda_ric-2.0.0/bin/aws_lambda_ric:10:in `<top (required)>'
    from /usr/local/bin/aws_lambda_ric:25:in `load'
    from /usr/local/bin/aws_lambda_ric:25:in `<main>'

Workaround

I've since found that we can handle this in Ruby by following the guidance from the graceful shutdown repo (linked above) and adding a trap to the code that loads the Lambda handler:

Signal.trap("TERM") do
  puts "Received SIGTERM, shutting down gracefully..."
end
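
A variation on the trap above, as a minimal sketch: instead of printing from inside the handler, the trap only records that shutdown was requested, and the rest of the code checks that flag. Ruby restricts what may run in trap context (for example, locking a Mutex there raises ThreadError, which rules out the standard Logger), so keeping the handler this small avoids surprises. The `ShutdownFlag` module and its method names are hypothetical and not part of the RIC.

```ruby
# Hypothetical helper (not part of aws_lambda_ric): records that a
# shutdown signal arrived so that normal code can react to it later.
module ShutdownFlag
  @requested = false

  # True once the trapped signal has been received.
  def self.requested?
    @requested
  end

  # Install the trap. Defaults to TERM, the signal described in this issue.
  def self.install(signal = "TERM")
    Signal.trap(signal) do
      # Keep the handler tiny: only set the flag, do no I/O or locking here.
      @requested = true
    end
  end
end
```

Calling `ShutdownFlag.install` once in the handler load code would replace the default SIGTERM behaviour, and cleanup code can then check `ShutdownFlag.requested?` at a safe point. Like the original workaround, this suppresses the SignalException rather than exiting the process.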

However, this seems like something that might best be handled by the RIC, since SIGTERM is an expected signal and the resulting error and stack trace are a bit misleading (it isn't an actual error). I can imagine there could be reasons not to do this (say, a Lambda runtime that makes its own use of SIGTERM), so this is more of a request for clarity and direction. I'm more than happy to propose a PR to address this or to help guide the discussion.
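
To make the suggestion concrete, here is a rough sketch of what handling the signal around a polling loop could look like. It is not the RIC's actual code: `run_until_sigterm` is a hypothetical wrapper, and the block passed to it stands in for the step that fetches and processes the next invocation. It relies only on Ruby's default behaviour of raising SignalException in the main thread when an untrapped SIGTERM arrives, which is what the stack trace above shows.

```ruby
# Hypothetical wrapper (not part of aws_lambda_ric): runs the given block
# in a loop and treats SIGTERM as a normal shutdown instead of an error.
def run_until_sigterm(out: $stdout)
  loop do
    yield # stand-in for "wait for and process the next invocation"
  end
rescue SignalException => e
  # Only SIGTERM is treated as expected; other signals propagate unchanged.
  raise unless e.signm == "SIGTERM"

  out.puts "Received SIGTERM, shutting down gracefully..."
  :shutdown
end
```

Rescuing only SIGTERM keeps other signals, such as SIGINT, behaving as before. Whether the RIC should do this by default is the open question in this issue.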

Additional Evidence