nginx / unit

NGINX Unit - universal web app server - a lightweight and versatile open source server that simplifies the application stack by natively executing application code across eight different programming language runtimes.
https://unit.nginx.org
Apache License 2.0

Enhancement: Docker: Restart application(s) after configuration has been applied #718

Open CooperTrooper21 opened 2 years ago

CooperTrooper21 commented 2 years ago

Hi Team, I have been using NGINX Unit for a while now (it’s awesome :) ) and there is one thing I have come across that I would like to enhance.

Correct me if I am wrong, but these are the stages when docker-entrypoint.sh is executed:

  1. discovery process is started
    • If a module is present, it is verbose about its presence (python, php, nodejs, etc.)
  2. Controller started
  3. Router started
  4. OpenSSL version shown in logs
  5. NGINX Unit starts
  6. Look for certificate bundles in docker-entrypoint.d/ folder; looking for files with the extension .pem
  7. Look for configuration snippets in docker-entrypoint.d/ folder; looking for files with the extension .json
    • Configuration is applied
    • If successfully applied, the application prototype is created and then the application(s) (depending on how many processes have been defined) are started
  8. Stops the Unit daemon by killing the process recorded in unit.pid
  9. Waits for control.unit.sock to be removed so that it can be recreated on the next startup

PROBLEM: At this stage, if the application that started in step 7 is still holding control.unit.sock, the docker-entrypoint.sh script loops forever, because it keeps waiting for the socket to be freed.

PROPOSAL: Use the process management GET method (https://unit.nginx.org/configuration/#process-management) to restart the application. We can introduce a for loop that goes through each application defined in each .json configuration file. This requires jq to be installed:

echo "$0: Restarting each application defined in configuration file(s)."
for f in $(/usr/bin/find /docker-entrypoint.d/ -type f -name "*.json"); do
    # jq -r prints raw strings, so no separate tr call is needed to strip the quotes
    for a in $(/usr/bin/jq -r ".applications | keys[]" "${f}"); do
        echo "${0}: Restarting ${a} from configuration file ${f}"
        /usr/bin/curl -s -X GET --unix-socket /var/run/control.unit.sock "http://localhost/control/applications/${a}/restart"
    done
done

tippexs commented 2 years ago

Hi @CooperTrooper21 Thanks for reporting this to us.

I am currently working on a fix for the “Waiting for control socket to be removed...” loop issue.

Your investigation and analysis are great. There are a few small things I want to add to shed a bit more light on this.

The purpose of the docker-entrypoint.sh script is to configure Unit initially at startup. In short, it fills Unit's so-called state directory in case it is empty.

So after we have initially configured Unit (you are correct, we add certificates and apply all configuration files), we stop it again so that it can be started normally and load the configuration from the state directory. There is no need to restart the applications at this point in the script.

We saw the loop issue in cases where the application processes are busy with, let’s call it, something, and therefore the socket cannot be deleted. I will submit a PR in the next few days fixing this issue.

It would be very great if you would test the fix in your environment as well and share your feedback. 🙏🏼

CooperTrooper21 commented 2 years ago

Hi @tippexs

Awesome! 😁 Let me know when the fix has been applied and I will be happy to test on my end.

Thanks

CooperTrooper21 commented 2 years ago

Hi @tippexs

Hope you are well! 😁

Do you know roughly when the fix will be applied?

Thanks

tippexs commented 2 years ago

Hi @CooperTrooper21 sorry for the late reply on this.

Can you test this fix in the docker-entrypoint.sh script?

59,70c59,61
<             for i in {1..5}; do
<               if [[ -S /var/run/control.unit.sock ]]
<               then
<                 echo "$0 Waiting for control socket to be removed..."
<                 /bin/sleep 1.0
<               else
<                 break
<               fi
<             done
<             if [ -S /var/run/control.unit.sock ]; then
<              kill -SIGTERM `/bin/cat /var/run/unit.pid` && rm -f /var/run/control.unit.sock
<             fi
---
>
>             while [ -S /var/run/control.unit.sock ]; do echo "$0: Waiting for control socket to be removed..."; /bin/sleep 0.1; done
>

Please take the script from our GitHub repo, apply the fix, and build a new Docker image that copies the modified file back into your image:

COPY docker-entrypoint.sh /usr/local/bin/

We are about to update the official Docker images.

thresheek commented 2 years ago

Please also try the following patch (essentially the same):

diff -r 29b3edfb613d pkg/docker/docker-entrypoint.sh
--- a/pkg/docker/docker-entrypoint.sh   Mon Sep 19 11:59:59 2022 +0100
+++ b/pkg/docker/docker-entrypoint.sh   Mon Sep 19 18:02:35 2022 +0400
@@ -2,6 +2,9 @@

 set -e

+WAITLOOPS=5
+SLEEPSEC=1
+
 curl_put()
 {
     RET=`/usr/bin/curl -s -w '%{http_code}' -X PUT --data-binary @$1 --unix-socket /var/run/control.unit.sock http://localhost/$2`
@@ -57,7 +60,18 @@ if [ "$1" = "unitd" -o "$1" = "unitd-deb
             echo "$0: Stopping Unit daemon after initial configuration..."
             kill -TERM `/bin/cat /var/run/unit.pid`

-            while [ -S /var/run/control.unit.sock ]; do echo "$0: Waiting for control socket to be removed..."; /bin/sleep 0.1; done
+            for i in `/usr/bin/seq $WAITLOOPS`; do
+                if [[ -S /var/run/control.unit.sock ]]; then
+                    echo "$0 Waiting for control socket to be removed..."
+                    /bin/sleep $SLEEPSEC
+                else
+                    break
+                fi
+            done
+            if [ -S /var/run/control.unit.sock ]; then
+                kill -KILL `/bin/cat /var/run/unit.pid`
+                rm -f /var/run/control.unit.sock
+            fi

             echo
             echo "$0: Unit initial configuration complete; ready for start up..."
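
The bounded wait in the patch above can be sketched as a standalone function. This is only an illustration under stated assumptions: the name wait_for_removal and its three arguments are hypothetical and do not appear in docker-entrypoint.sh, and it checks with -e (path exists) instead of -S (path is a socket) so that it can be tried against an ordinary file.

```shell
# wait_for_removal PATH LOOPS DELAY   (hypothetical helper, not part of Unit)
# Polls until PATH no longer exists, checking at most LOOPS times and
# sleeping DELAY seconds after each check that still finds it.
# Returns 0 if PATH disappeared, 1 if it is still present at the end.
wait_for_removal() {
    path=$1
    loops=$2
    delay=$3
    i=0
    while [ "$i" -lt "$loops" ]; do
        [ -e "$path" ] || return 0
        sleep "$delay"
        i=$((i + 1))
    done
    # Final check decides the return status after the last sleep.
    [ ! -e "$path" ]
}

# Possible usage mirroring the patch: escalate only if the wait timed out.
# if ! wait_for_removal /var/run/control.unit.sock 5 1; then
#     kill -KILL "$(cat /var/run/unit.pid)"
#     rm -f /var/run/control.unit.sock
# fi
```

Calling the helper inside an if condition keeps the script from aborting under set -e when the wait times out.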

CooperTrooper21 commented 2 years ago

Thank you both for the help so far! :) @tippexs @thresheek

I have created a dummy app to use as a reference for this issue. This will all be containerised and I will share the files below. I can confirm the app process does stop rather than consuming the socket for eternity, but I am seeing some unusual behaviour now.

Goal: When the app starts, I want it to request https://www.google.com once every second, n times.

Expected: When configuration is applied, the app will start. Here the app process should be stopped immediately so unit can restart.

Behaviour: When I run the docker container, it completes its usual routine of starting all the necessary services for Unit and searching for configuration inside /docker-entrypoint.d/. After applying the configuration, my app starts, triggering the "on_event" function, which begins making requests to https://www.google.com.

At this point, whatever number of iterations I have specified, the script waits that long and only then is the process stopped. In this example, I specified that the app should request https://www.google.com 100 times, once per second (100 seconds). From the console output, I can see that it took 125 seconds before the app was stopped. If I change the iterations to 10 (10 seconds), it takes approximately 15 seconds before the app is stopped.

Please see the console output below:

/usr/local/bin/docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, launching Unit daemon to perform initial configuration...
2022/10/06 14:04:29 [info] 21#21 unit 1.28.0 started
2022/10/06 14:04:29 [info] 28#28 discovery started
2022/10/06 14:04:29 [notice] 28#28 module: python 3.10.7 "/usr/lib/unit/modules/python3.unit.so"
2022/10/06 14:04:29 [info] 24#24 controller started
2022/10/06 14:04:29 [notice] 24#24 process 28 exited with code 0
2022/10/06 14:04:29 [info] 33#33 router started
2022/10/06 14:04:29 [info] 33#33 OpenSSL 1.1.1n  15 Mar 2022, 101010ef
{
        "certificates": {},
        "config": {
                "listeners": {},
                "applications": {}
        },

        "status": {
                "connections": {
                        "accepted": 0,
                        "active": 0,
                        "idle": 0,
                        "closed": 0
                },

                "requests": {
                        "total": 0
                },

                "applications": {}
        }
}
/usr/local/bin/docker-entrypoint.sh: Looking for certificate bundles in /docker-entrypoint.d/...
/usr/local/bin/docker-entrypoint.sh: Looking for configuration snippets in /docker-entrypoint.d/...
/usr/local/bin/docker-entrypoint.sh: Applying configuration /docker-entrypoint.d/config.json
2022/10/06 14:04:29 [info] 44#44 "webapp" prototype started
2022/10/06 14:04:29 [info] 46#46 "webapp" application started
/usr/local/bin/docker-entrypoint.sh: OK: HTTP response status code is '200'
{
        "success": "Reconfiguration done."
}

/usr/local/bin/docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/...
/usr/local/bin/docker-entrypoint.sh: Stopping Unit daemon after initial configuration...
2022/10/06 14:06:34 [notice] 24#24 process 31 exited with code 0
2022/10/06 14:06:34 [notice] 24#24 process 33 exited with code 0
/usr/local/bin/docker-entrypoint.sh Waiting for control socket to be removed...
2022/10/06 14:06:34 [notice] 44#44 app process 46 exited with code 0
2022/10/06 14:06:34 [alert] 44#44 sendmsg(13, -1, -1, 2) failed (32: Broken pipe)
2022/10/06 14:06:34 [notice] 24#24 process 44 exited with code 0

/usr/local/bin/docker-entrypoint.sh: Unit initial configuration complete; ready for start up...

2022/10/06 14:06:35 [info] 1#1 unit 1.28.0 started
2022/10/06 14:06:35 [info] 75#75 discovery started
2022/10/06 14:06:35 [notice] 75#75 module: python 3.10.7 "/usr/lib/unit/modules/python3.unit.so"
2022/10/06 14:06:35 [info] 1#1 controller started
2022/10/06 14:06:35 [notice] 1#1 process 75 exited with code 0
2022/10/06 14:06:35 [info] 79#79 router started
2022/10/06 14:06:35 [info] 79#79 OpenSSL 1.1.1n  15 Mar 2022, 101010ef
2022/10/06 14:06:35 [info] 81#81 "webapp" prototype started
2022/10/06 14:06:35 [info] 83#83 "webapp" application started

I have attached all the files below. This includes the second fix provided in this thread by @thresheek. I have tried with the first fix provided by @tippexs and I was seeing the exact same behaviour: fastapi_nginx_unit.tar.gz
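
The 125-second figure can be checked against the timestamps in the console output above: the configuration is applied at 14:04:29 and the processes exit at 14:06:34. A minimal sketch of that arithmetic in POSIX shell follows; the function names to_seconds and elapsed_seconds are made up for illustration, and both timestamps are assumed to be HH:MM:SS on the same day.

```shell
# Convert an HH:MM:SS timestamp to seconds since midnight.
to_seconds() {
    # Split the argument on ":" into the positional parameters.
    old_ifs=$IFS
    IFS=:
    set -- $1
    IFS=$old_ifs
    # Strip one leading zero so "08" and "09" are not read as invalid octal.
    h=${1#0}
    m=${2#0}
    s=${3#0}
    echo $(( h * 3600 + m * 60 + s ))
}

# elapsed_seconds START END: difference between two same-day timestamps.
elapsed_seconds() {
    start=$(to_seconds "$1")
    end=$(to_seconds "$2")
    echo $(( end - start ))
}

elapsed_seconds 14:04:29 14:06:34   # prints 125
```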

CooperTrooper21 commented 1 year ago

Hi @tippexs

Just wanted to confirm: has this issue been resolved? If so, we can close it off.

Thanks in advance, Reece