Closed revengel closed 3 months ago
Could you please provide the full configuration (angie -T)?
Yes, it's here: https://gist.github.com/revengel/584c109556c3448f9cdb542f46c03524
Thanks. I see nothing suspicious... Could you also enable debug logging and collect a log that includes the error?
Debug logs are here:
2024/07/01 23:29:04 [alert] 1#1: worker process 10 exited on signal 11 (core dumped)
2024/07/01 23:29:04 [notice] 1#1: start worker process 11
2024/07/01 23:29:04 [alert] 11#11: setpriority(-5) failed (13: Permission denied)
2024/07/01 23:29:04 [debug] 11#11: epoll add event: fd:43 op:1 ev:00002001
2024/07/01 23:29:04 [debug] 11#11: epoll add event: fd:44 op:1 ev:00002001
2024/07/01 23:29:04 [debug] 11#11: acme status wgui_example_com: certificate scheduled for renewal on Wed Jul 3 00:16:48 2024
2024/07/01 23:29:04 [debug] 11#11: accept on 0.0.0.0:443, ready: 0
2024/07/01 23:29:04 [debug] 11#11: posix_memalign: 00007F76038E3800:512 @16
2024/07/01 23:29:04 [debug] 11#11: *7 accept: 172.18.0.1:49498 fd:3
2024/07/01 23:29:04 [debug] 11#11: *7 event timer add: 3: 60000:17078525
2024/07/01 23:29:04 [debug] 11#11: *7 reusable connection: 1
2024/07/01 23:29:04 [debug] 11#11: *7 epoll add event: fd:3 op:1 ev:80002001
2024/07/01 23:29:04 [debug] 11#11: *7 http check ssl handshake
2024/07/01 23:29:04 [debug] 11#11: *7 http recv(): 1
2024/07/01 23:29:04 [debug] 11#11: *7 https ssl handshake: 0x16
2024/07/01 23:29:04 [debug] 11#11: *7 tcp_nodelay
2024/07/01 23:29:04 [debug] 11#11: *7 reusable connection: 0
2024/07/01 23:29:04 [debug] 11#11: *7 SSL server name: "dav.example.com"
2024/07/01 23:29:04 [debug] 11#11: *7 posix_memalign: 00007F7603866320:4096 @16
2024/07/01 23:29:04 [debug] 11#11: *7 posix_memalign: 00007F7603867560:4096 @16
2024/07/01 23:29:04 [notice] 1#1: signal 17 (SIGCHLD) received from 11
2024/07/01 23:29:04 [alert] 1#1: worker process 11 exited on signal 11 (core dumped)
2024/07/01 23:29:04 [notice] 1#1: start worker process 12
2024/07/01 23:29:04 [alert] 12#12: setpriority(-5) failed (13: Permission denied)
2024/07/01 23:29:04 [debug] 12#12: epoll add event: fd:43 op:1 ev:00002001
2024/07/01 23:29:04 [debug] 12#12: epoll add event: fd:44 op:1 ev:00002001
2024/07/01 23:29:04 [debug] 12#12: acme status wgui_example_com: certificate scheduled for renewal on Wed Jul 3 00:16:48 2024
2024/07/01 23:29:04 [debug] 12#12: accept on 0.0.0.0:443, ready: 0
2024/07/01 23:29:04 [debug] 12#12: posix_memalign: 00007F76038E3800:512 @16
2024/07/01 23:29:04 [debug] 12#12: *8 accept: 172.18.0.1:49502 fd:3
2024/07/01 23:29:04 [debug] 12#12: *8 event timer add: 3: 60000:17078677
2024/07/01 23:29:04 [debug] 12#12: *8 reusable connection: 1
2024/07/01 23:29:04 [debug] 12#12: *8 epoll add event: fd:3 op:1 ev:80002001
2024/07/01 23:29:04 [debug] 12#12: *8 http check ssl handshake
2024/07/01 23:29:04 [debug] 12#12: *8 http recv(): 1
2024/07/01 23:29:04 [debug] 12#12: *8 https ssl handshake: 0x16
2024/07/01 23:29:04 [debug] 12#12: *8 tcp_nodelay
2024/07/01 23:29:04 [debug] 12#12: *8 reusable connection: 0
2024/07/01 23:29:04 [debug] 12#12: *8 SSL server name: "caldav.example.com"
2024/07/01 23:29:04 [debug] 12#12: *8 posix_memalign: 00007F7603866320:4096 @16
2024/07/01 23:29:04 [debug] 12#12: *8 posix_memalign: 00007F7603867560:4096 @16
2024/07/01 23:29:04 [notice] 1#1: signal 17 (SIGCHLD) received from 12
Is it possible to get a core dump from that container?
Do you need any assistance extracting a core dump? You can write to me directly on Telegram (@VBart) or by email at vbart@wbsrv.ru. We are very interested in debugging this issue, but unfortunately neither the config nor the debug log gives any hints here.
Yes, please help me collect this data (the core dump). I run Angie in Docker via Docker Compose. Can you please provide simple instructions?
I've tried to reproduce the error using your configuration on our Angie 1.6.0 docker image, but to no avail.
I don't know how you are using docker compose, so I can't give you precise instructions on extracting the core file. The idea is to mount a local directory into your container (e.g. using -v $(pwd):/shared) and run docker in interactive mode with a terminal (using -it). Then you should start Angie manually (e.g. # angie -g 'daemon off;'), and when you get a core dump, you can just copy it to your local directory (e.g. # cp core.9 /shared). With our Alpine docker image, core dumps are just created in the current directory and I didn't even have to configure that.
_usr_sbin_angie-nodebug.33.crash[1].gz
I had the same problem and downgraded back to 1.5.2.
On v1.6, every worker process exited with "core dumped" :(
Ubuntu 22.04.4 LTS, i9-12900K
Hi @gun4A, thanks a lot for your input. The source of the bug has been identified, and we will fix it in the next release.
@gun4A, here's a patch that fixes this, in case you wish to test. fix_acme_crashes.patch.gz
The issue fixed by the patch above manifests itself only when more than 4 acme_client directives are configured. But the config provided earlier (https://gist.github.com/revengel/584c109556c3448f9cdb542f46c03524) contains only one acme_client directive.
@revengel are you sure that you've provided the full configuration previously?
@VBart There are more than 4 acme_client directives in my config. The config in the gist is really not quite complete. I cut out the excess for the sake of compactness.
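For readers wondering why the crash would depend on the number of acme_client directives: the rejected hunk quoted in this thread replaces ngx_array_push(&amcf->clients) with ngx_pcalloc(...), so each client gets its own allocation instead of a slot in a growable array. One reading that fits both facts, offered as an assumption rather than something the thread confirms, is that the array started with room for 4 elements and that pointers to earlier elements were kept across a later growth. The toy array below is hypothetical and is not the real ngx_array_t; it only shows how a saved element pointer stops referring to the live element once the array moves to a larger block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy growable array (hypothetical, NOT ngx_array_t). */
typedef struct {
    int    *elts;    /* current storage block */
    size_t  nelts;   /* elements in use */
    size_t  nalloc;  /* capacity of the current block */
} toy_array_t;

static int toy_array_init(toy_array_t *a, size_t n)
{
    a->elts = malloc(n * sizeof(int));
    if (a->elts == NULL) {
        return -1;
    }
    a->nelts = 0;
    a->nalloc = n;
    return 0;
}

/*
 * Returns a pointer to a fresh slot. When the block is full, the elements
 * are copied into a block twice as large. The old block is deliberately
 * left allocated (as pool memory would be), so old pointers remain readable
 * but no longer refer to the array's live elements.
 */
static int *toy_array_push(toy_array_t *a)
{
    if (a->nelts == a->nalloc) {
        int *p = malloc(2 * a->nalloc * sizeof(int));
        if (p == NULL) {
            return NULL;
        }
        memcpy(p, a->elts, a->nelts * sizeof(int));
        a->elts = p;
        a->nalloc *= 2;
    }
    return &a->elts[a->nelts++];
}

/*
 * Saves a pointer to the first element, pushes until `pushes` elements
 * exist, and reports whether the saved pointer is stale (1) or still
 * points at the live first element (0). Returns -1 on allocation failure.
 * Memory is leaked on purpose to keep the sketch short.
 */
static int first_pointer_goes_stale(size_t capacity, size_t pushes)
{
    toy_array_t  a;
    int         *first, *slot;
    size_t       i;

    assert(capacity > 0 && pushes > 0);

    if (toy_array_init(&a, capacity) != 0) {
        return -1;
    }
    first = toy_array_push(&a);
    if (first == NULL) {
        return -1;
    }
    *first = 1;

    for (i = 1; i < pushes; i++) {
        slot = toy_array_push(&a);
        if (slot == NULL) {
            return -1;
        }
        *slot = 0;
    }
    return first != &a.elts[0];
}
```

With an assumed capacity of 4, the saved pointer stays valid through the fourth push and goes stale on the fifth, which would match a bug that only appears with more than 4 directives. Allocating each element separately, as the patch does, avoids the relocation entirely.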
patching file ngx_http_acme_module.c
Hunk #4 succeeded at 4253 (offset -4 lines).
Hunk #5 succeeded at 4508 (offset -4 lines).
Hunk #6 succeeded at 5157 (offset -8 lines).
Hunk #7 succeeded at 5169 (offset -8 lines).
Hunk #8 FAILED at 5188.
Hunk #9 succeeded at 5212 (offset -9 lines).
1 out of 9 hunks FAILED -- saving rejects to file ngx_http_acme_module.c.rej

Contents of ngx_http_acme_module.c.rej:

--- ngx_http_acme_module.c Mon Jul 29 11:53:51 2024 +0300
+++ ngx_http_acme_module.c Mon Jul 29 14:29:23 2024 +0300
@@ -5188,13 +5188,11 @@
         return cli;
     }
 
-    cli = ngx_array_push(&amcf->clients);
+    cli = ngx_pcalloc(cf->pool, sizeof(ngx_acme_client_t));
     if (cli == NULL) {
         return NULL;
     }
 
-    ngx_memzero(cli, sizeof(ngx_acme_client_t));
-
     cli->log = cf->log;
     cli->name = *name;
     cli->enabled = NGX_CONF_UNSET_UINT;
The patch was against the latest revision at the moment. Here's a version of the patch against 1.6.0 release: fix_acme_crashes_v1.6.0.patch.gz
Arrgh, sorry for the wrong patch. I forgot that people aren't using our latest version :smile:
Fixed by https://github.com/webserver-llc/angie/commit/cfd01492f3db4a349cff4d703bdc7439d15bc2df (Angie 1.6.1).
I use the ACME module to renew certificates.
Angie runs in an Alpine-based container.
On version 1.5.2 it works well.
After upgrading to version 1.6.0, the worker processes crash in a loop.
Please help me figure out the problem.