openfaas / templates

OpenFaaS Classic templates
https://www.openfaas.com
MIT License

add graceful shutdown for node18 functions #306

Open NikhilSharmaWe opened 1 year ago

NikhilSharmaWe commented 1 year ago

Description

This PR adds graceful shutdown for template node18 functions.

Motivation and Context

Which issue(s) this PR fixes

Fixes #305

How Has This Been Tested?

  1. First, create a function from the updated node18 template.
  2. After the function is created, send SIGTERM to the running function pod, for example: kubectl -n openfaas-fn exec testdrainfunc-b44d49bb6-9kzzh -- kill -s SIGTERM 1
  3. The logs show that the added shutdown logic runs:
    2023-05-26T09:18:38Z 2023/05/26 09:18:38 SIGTERM: no new connections in 15s
    2023-05-26T09:18:38Z 2023/05/26 09:18:38 Removing lock-file : /tmp/.lock
    2023-05-26T09:18:38Z Function got SIGTERM event, draining up to: 15s
    2023-05-26T09:18:38Z Server gracefully shut down
    2023-05-26T09:18:53Z 2023/05/26 09:18:53 No new connections allowed, draining: 0 requests
    2023-05-26T09:18:53Z 2023/05/26 09:18:53 Exiting. Active connections: 0

Types of changes

Impact to existing users

There will be no significant impact on how users use OpenFaaS functions.

Checklist:

alexellis commented 1 year ago

This looks very similar to what I was expecting. I think you'll need to test it, and you'll also need to wait for the health check duration before calling close on the server. See how we do that in the Go template.
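
The ordering described above (receive SIGTERM, keep serving for the health check duration, then call close on the server) could be sketched roughly as follows. This is an illustration only, not the code from this PR or from the Go template; makeSigtermHandler and its schedule parameter are hypothetical names, and the delay is assumed to be the health check interval in milliseconds.

```javascript
'use strict';

// Build a SIGTERM handler that waits `delayMs` before closing `server`.
// `schedule` defaults to setTimeout; it is injectable only so the sketch
// can be exercised without real timers (an assumption of this example).
function makeSigtermHandler(server, delayMs, schedule = setTimeout) {
  let shuttingDown = false;

  return function onSigterm() {
    if (shuttingDown) {
      return false; // ignore repeated signals
    }
    shuttingDown = true;

    // Keep serving during the delay so the health check can fail and
    // new requests stop being routed to this replica.
    schedule(() => {
      // close() stops accepting new connections; the callback runs once
      // the server has finished closing.
      server.close(() => console.log('Server gracefully shut down'));
    }, delayMs);

    return true;
  };
}

// Usage (not executed here), assuming an http.Server named `server`:
//   process.on('SIGTERM', makeSigtermHandler(server, 5000));
```

With this ordering, the "Server gracefully shut down" line would appear after the delay rather than at the same timestamp as the SIGTERM line.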

alexellis commented 1 year ago

I've sent you a trial license for OpenFaaS Pro/Standard.

Would you like to try testing it with a 10min shutdown time?

Installation -> https://docs.openfaas.com/deployment/pro/

Testing with long timeouts -> https://www.openfaas.com/blog/long-running-jobs/

I simulate it like this:

Deploy the function (with a 10m sleep), invoke it, and watch the logs.

Then I update the code and image tag, and redeploy it.

If it's all working right, the Pod should go to Terminating but stay around until the invocation has completed successfully.

Here's my test function for Go/Python/Node for a longer timeout if you need it - https://github.com/alexellis/go-long
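
For the node18 template, a long-running test handler along these lines might be enough to reproduce the scenario. It is a sketch, not the code from the linked repository: the sleep_ms environment variable and the sleepMs helper are hypothetical, and the 10 minute default only mirrors the sleep mentioned above.

```javascript
'use strict';

// Resolve after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Read the sleep length from a hypothetical `sleep_ms` env var,
// falling back to 10 minutes when it is missing or invalid.
function sleepMs(env = process.env) {
  const n = Number.parseInt(env.sleep_ms, 10);
  return Number.isNaN(n) || n < 0 ? 10 * 60 * 1000 : n;
}

// Handler in the shape used by the node templates: (event, context).
const handler = async (event, context) => {
  const started = Date.now();
  await sleep(sleepMs());
  // Report how long the invocation actually slept.
  return context.status(200).succeed({ sleptMs: Date.now() - started });
};

module.exports = handler;
```

If draining works, an invocation that is already in flight when the Pod goes to Terminating should still return its 200 response once the sleep finishes.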

NikhilSharmaWe commented 1 year ago

@alexellis

I tested the updated node18 template with:

environment:
  write_timeout: 10m2s
  healthcheck_interval: 5s
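
The values above look like Go-style duration strings (10m2s, 5s). If the Node side needs them in milliseconds, for example to decide how long to wait before closing the server, a small parser such as the one below could be used. This is a sketch under that assumption; parseDurationMs is a hypothetical helper, not part of the template, and it handles only the h, m, s and ms units.

```javascript
'use strict';

// Milliseconds per supported unit (a subset of Go's duration units).
const UNIT_MS = { h: 3600000, m: 60000, s: 1000, ms: 1 };

// Convert strings such as "10m2s" or "5s" to milliseconds.
// Throws on empty input or on anything that is not a sequence of
// <number><unit> pairs.
function parseDurationMs(text) {
  const token = /(\d+(?:\.\d+)?)(ms|h|m|s)/g;
  let total = 0;
  let consumed = 0;
  let match;

  while ((match = token.exec(text)) !== null) {
    if (match.index !== consumed) {
      break; // unexpected characters between tokens
    }
    total += Number(match[1]) * UNIT_MS[match[2]];
    consumed = token.lastIndex;
  }

  if (text.length === 0 || consumed !== text.length) {
    throw new Error(`invalid duration: ${text}`);
  }
  return total;
}

console.log(parseDurationMs('10m2s')); // 602000
console.log(parseDurationMs('5s'));    // 5000
```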

After running kubectl scale -n openfaas-fn deploy/node18 --replicas=0, the logs for the node18 function are:

2023-05-27T19:03:50Z 2023/05/27 19:03:50 SIGTERM: no new connections in 5s
2023-05-27T19:03:50Z 2023/05/27 19:03:50 Removing lock-file : /tmp/.lock
2023-05-27T19:03:50Z Function got SIGTERM event, draining up to: 10m2s
2023-05-27T19:03:50Z Server gracefully shut down
2023-05-27T19:03:55Z 2023/05/27 19:03:55 No new connections allowed, draining: 0 requests
2023-05-27T19:03:55Z 2023/05/27 19:03:55 Exiting. Active connections: 0

alexellis commented 1 year ago

Thanks for working on this and for testing the change. What I'd like to see is a curl statement - followed by you scaling to zero replicas. Show that "time curl ..." completes despite you scaling down. You'll need the Pro/Standard license that I sent you separately.