nginx-shib / nginx-http-shibboleth

Shibboleth auth request module for nginx
https://github.com/nginx-shib/nginx-http-shibboleth/wiki
Other
209 stars 27 forks source link

shibauthorizer and shibresponder cannot be started by supervisord #28

Closed chankongching closed 6 years ago

chankongching commented 6 years ago

Description of Issue/Question

Trying to install shibboleth FastCGI support and after recompiling shibboleth.x86_64 2.6.0-2.2.el6, shibresponder and shibauthorizer exists.

[root@ip-10-0-0-33 shibboleth]# ls -lrt /usr/lib64/shibboleth/
total 912
-rwxr-xr-x 1 root root  41288 Oct  1 18:17 shibresponder
-rwxr-xr-x 1 root root  48856 Oct  1 18:17 shibauthorizer
-rwxr-xr-x 1 root root 240808 Oct  1 18:17 plugins.so
-rwxr-xr-x 1 root root 104112 Oct  1 18:17 plugins-lite.so
-rwxr-xr-x 1 root root  83096 Oct  1 18:17 odbc-store.so
-rwxr-xr-x 1 root root 139472 Oct  1 18:17 mod_shib_22.so
-rwxr-xr-x 1 root root 159304 Oct  1 18:17 adfs.so
-rwxr-xr-x 1 root root  99120 Oct  1 18:17 adfs-lite.so

Setup

./shibboleth-rebuild.sh
rpm -ivh rpmbuild/RPMS/x86_64/shibboleth-2.6.0-2.2.el6.x86_64.rpm

Steps to Reproduce Issue

on supervisord log, it is just telling me the process is exited

2017-10-03 09:35:13,064 INFO spawned: 'shibauthorizer' with pid 8454
2017-10-03 09:35:13,156 INFO exited: shibauthorizer (exit status 0; not expected)
2017-10-03 09:35:13,156 INFO received SIGCLD indicating a child quit
2017-10-03 09:35:14,158 INFO spawned: 'shibauthorizer' with pid 8457
2017-10-03 09:35:14,238 INFO exited: shibauthorizer (exit status 0; not expected)
2017-10-03 09:35:14,239 INFO received SIGCLD indicating a child quit
2017-10-03 09:35:16,242 INFO spawned: 'shibauthorizer' with pid 8460
2017-10-03 09:35:16,323 INFO exited: shibauthorizer (exit status 0; not expected)
2017-10-03 09:35:16,323 INFO received SIGCLD indicating a child quit
2017-10-03 09:35:19,328 INFO spawned: 'shibauthorizer' with pid 8463
2017-10-03 09:35:19,408 INFO exited: shibauthorizer (exit status 0; not expected)
2017-10-03 09:35:19,408 INFO received SIGCLD indicating a child quit
2017-10-03 09:35:20,409 INFO gave up: shibauthorizer entered FATAL state, too many start retries too quickly
2017-10-03 09:36:59,807 INFO localhost:0 - - [03/Oct/2017:09:36:59 +0000] "POST /RPC2 HTTP/1.0" 200 254
2017-10-03 09:36:59,809 INFO spawned: 'shibauthorizer' with pid 8495
2017-10-03 09:36:59,901 INFO exited: shibauthorizer (exit status 0; not expected)
2017-10-03 09:36:59,901 INFO received SIGCLD indicating a child quit
2017-10-03 09:37:00,903 INFO spawned: 'shibauthorizer' with pid 8498
2017-10-03 09:37:00,986 INFO exited: shibauthorizer (exit status 0; not expected)
2017-10-03 09:37:00,987 INFO received SIGCLD indicating a child quit
2017-10-03 09:37:02,990 INFO spawned: 'shibauthorizer' with pid 8501
2017-10-03 09:37:03,074 INFO exited: shibauthorizer (exit status 0; not expected)
2017-10-03 09:37:03,074 INFO received SIGCLD indicating a child quit
2017-10-03 09:37:06,079 INFO spawned: 'shibauthorizer' with pid 8504
2017-10-03 09:37:06,165 INFO exited: shibauthorizer (exit status 0; not expected)
2017-10-03 09:37:06,165 INFO received SIGCLD indicating a child quit
2017-10-03 09:37:07,167 INFO gave up: shibauthorizer entered FATAL state, too many start retries too quickly

Versions and Systems

(nginx -V, shibd -v (and compile options), OS type and version)

nginx version: nginx/1.12.1
built by gcc 4.4.7 20120313 (Red Hat 4.4.7-18) (GCC) 
built with OpenSSL 1.0.1e-fips 11 Feb 2013
TLS SNI support enabled
configure arguments: --prefix=/etc/nginx --sbin-path=/usr/sbin/nginx --modules-path=/usr/lib64/nginx/modules --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --pid-path=/var/run/nginx.pid --lock-path=/var/run/nginx.lock --http-client-body-temp-path=/var/cache/nginx/client_temp --http-proxy-temp-path=/var/cache/nginx/proxy_temp --http-fastcgi-temp-path=/var/cache/nginx/fastcgi_temp --http-uwsgi-temp-path=/var/cache/nginx/uwsgi_temp --http-scgi-temp-path=/var/cache/nginx/scgi_temp --user=nginx --group=nginx --with-compat --with-file-aio --with-threads --with-http_addition_module --with-http_auth_request_module --with-http_dav_module --with-http_flv_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_mp4_module --with-http_random_index_module --with-http_realip_module --with-http_secure_link_module --with-http_slice_module --with-http_ssl_module --with-http_stub_status_module --with-http_sub_module --with-http_v2_module --with-mail --with-mail_ssl_module --with-stream --with-stream_realip_module --with-stream_ssl_module --with-stream_ssl_preread_module --with-cc-opt='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -fPIC' --with-ld-opt='-Wl,-z,relro -Wl,-z,now -pie'
shibd -v
shibboleth 2.6.0
cat /etc/redhat-release 
CentOS release 6.9 (Final)
chankongching commented 6 years ago

here is my supervisord.conf

[unix_http_server]
file=/var/run/supervisord/supervisor.sock   ; UNIX socket 文件,supervisorctl 会使用
;chmod=0700                 ; socket 文件的 mode,默认是 0700
;chown=nobody:nogroup       ; socket 文件的 owner,格式: uid:gid

;[inet_http_server]         ; HTTP 服务器,提供 web 管理界面
;port=127.0.0.1:9001        ; Web 管理后台运行的 IP 和端口,如果开放到公网,需要注意安全性
;username=user              ; 登录管理后台的用户名
;password=123               ; 登录管理后台的密码

[supervisord]
http_port=/var/run/supervisord/supervisor.sock
logfile=/var/log/supervisor/supervisord.log ; 日志文件,默认是 $CWD/supervisord.log
logfile_maxbytes=50MB        ; 日志文件大小,超出会 rotate,默认 50MB
logfile_backups=10           ; 日志文件保留备份数量默认 10
loglevel=info                ; 日志级别,默认 info,其它: debug,warn,trace
pidfile=/var/run/supervisord/supervisord.pid ; pid 文件
nodaemon=false               ; 是否在前台启动,默认是 false,即以 daemon 的方式启动
minfds=1024                  ; 可以打开的文件描述符的最小值,默认 1024
minprocs=200                 ; 可以打开的进程数的最小值,默认 200

; the below section must remain in the config file for RPC
; (supervisorctl/web interface) to work, additional interfaces may be
; added by defining them in separate rpcinterface: sections
[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface

[supervisorctl]
serverurl=unix:///var/run/supervisord/supervisor.sock ; 通过 UNIX socket 连接 supervisord,路径与 unix_http_server 部分的 file 一致
;serverurl=http://127.0.0.1:9001 ; 通过 HTTP 的方式连接 supervisord

; 包含其他的配置文件
;[include]
;files = /etc/supervisord/conf.d/shib.conf 
;files = /etc/supervisord/conf.d/*.conf 

[program:shibauthorizer]
command=/usr/lib64/shibboleth/shibauthorizer
socket=unix:///opt/shibboleth/shibauthorizer.sock
socket_owner=root:root
socket_mode=0770
user=root
stdout_logfile=/var/log/supervisor/shibauthorizer.log
stderr_logfile=/var/log/supervisor/shibauthorizer.error.log

[program:shibresponder]
command=/usr/lib64/shibboleth/shibresponder
socket=unix:///opt/shibboleth/shibresponder.sock
socket_owner=root:root
socket_mode=0770
user=root
stdout_logfile=/var/log/supervisor/shibresponder.log
stderr_logfile=/var/log/supervisor/shibresponder.error.log
davidjb commented 6 years ago

Your config needs to tell Supervisor you're running FastCGI applications, eg:

[fcgi-program:shibauthorizer]
command=/usr/lib64/shibboleth/shibauthorizer
...
[fcgi-program:shibresponder]
command=/usr/lib64/shibboleth/shibresponder

rather than the [program:] sections you were using, where Supervisor is trying to run the applications normally. Docs at http://supervisord.org/configuration.html#fcgi-program-x-section-settings.

Also, I'd recommend not running shibauthorizer/shibresponder as root and you can reduce the permissions on its socket to 0660 and socket owned by a corresponding user/group. I kept my applications and sockets run as / owned by the shibd user, to match what the shibd process does. This way, in case any shared files (eg logs), there won't be any permission issues if the FastCGI apps create/modify any of permissions/ownership on these files.

Also, in future this type of issue/question is best reported at https://github.com/nginx-shib/shibboleth-fastcgi as it's related to the FastCGI applications. Thanks!

chankongching commented 6 years ago

i have updated my supervisord.conf to the following, as u mentioned, chaning to fcgi-program but i still cannot make the socket appears. What log should i check to get further?

[unix_http_server]
file=/var/run/supervisord/supervisor.sock   ; UNIX socket 文件,supervisorctl 会使用
;chmod=0700                 ; socket 文件的 mode,默认是 0700
;chown=nobody:nogroup       ; socket 文件的 owner,格式: uid:gid

;[inet_http_server]         ; HTTP 服务器,提供 web 管理界面
;port=127.0.0.1:9001        ; Web 管理后台运行的 IP 和端口,如果开放到公网,需要注意安全性
;username=user              ; 登录管理后台的用户名
;password=123               ; 登录管理后台的密码

[supervisord]
http_port=/var/run/supervisord/supervisor.sock
logfile=/var/log/supervisor/supervisord.log ; 日志文件,默认是 $CWD/supervisord.log
logfile_maxbytes=50MB        ; 日志文件大小,超出会 rotate,默认 50MB
logfile_backups=10           ; 日志文件保留备份数量默认 10
loglevel=info                ; 日志级别,默认 info,其它: debug,warn,trace
pidfile=/var/run/supervisord/supervisord.pid ; pid 文件
nodaemon=false               ; 是否在前台启动,默认是 false,即以 daemon 的方式启动
minfds=1024                  ; 可以打开的文件描述符的最小值,默认 1024
minprocs=200                 ; 可以打开的进程数的最小值,默认 200

; the below section must remain in the config file for RPC
; (supervisorctl/web interface) to work, additional interfaces may be
; added by defining them in separate rpcinterface: sections
[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface

[supervisorctl]
serverurl=unix:///var/run/supervisord/supervisor.sock ; 通过 UNIX socket 连接 supervisord,路径与 unix_http_server 部分的 file 一致
;serverurl=http://127.0.0.1:9001 ; 通过 HTTP 的方式连接 supervisord

; 包含其他的配置文件
;[include]
;files = /etc/supervisord/conf.d/shib.conf
;files = /etc/supervisord/conf.d/*.conf

[fcgi-program:shibauthorizer]
command=/usr/lib64/shibboleth/shibauthorizer
socket=unix:///opt/shibboleth/shibauthorizer.sock
socket_owner=root:root
socket_mode=0770
user=root
stdout_logfile=/var/log/supervisor/shibauthorizer.log
stderr_logfile=/var/log/supervisor/shibauthorizer.error.log

[fcgi-program:shibresponder]
command=/usr/lib64/shibboleth/shibresponder
socket=unix:///opt/shibboleth/shibresponder.sock
socket_owner=root:root
socket_mode=0770
user=root
stdout_logfile=/var/log/supervisor/shibresponder.log
stderr_logfile=/var/log/supervisor/shibresponder.error.log

Cheers and Thank you very much for your help

davidjb commented 6 years ago

What does supervisorctl report regarding the status of the processes or else what do the supervisord logs say? Has supervisorctl's config been updated since the change and if so, are the processes started and running or do they still fail? The logs (eg the locations you have configured for stdout_logfile and stderr_logfile) will explain what's happening. Your config looks fine otherwise now (save the caveat above about permissions, but that won't stop the programs running).

You can also test the shibauthorizer and shibresponder applications by running them directly. They'll end immediately (exit code 0) but output should look like this for both:

Shibboleth initialization complete. Starting request loop.
Request loop ended.

If any errors or other output occurs, I'd start investigating that too, which could be related to compilation/packaging of Shibboleth itself.

chankongching commented 6 years ago

here is my response when i am running them directly /usr/lib64/shibboleth/shibresponder Shibboleth initialization complete. Starting request loop. Request loop ended. /usr/lib64/shibboleth/shibresponder Shibboleth initialization complete. Starting request loop. Request loop ended. All other logs are empty cd /var/log/supervisor/ ls -lrt total 32 -rwxrwxrwx 1 root root 0 Oct 3 07:37 shibauthorizer.log -rwxrwxrwx 1 root root 0 Oct 3 07:38 shibauthorizer.error.log -rwxrwxrwx 1 root root 0 Oct 3 07:38 shibresponder.log -rwxrwxrwx 1 root root 0 Oct 3 07:38 shibresponder.error.log -rwxrwxrwx 1 root root 27095 Oct 5 13:42 supervisord.log I have flushed my logs and restarted supervisord, but all other logs are always empty ls -lrt total 4 -rwxrwxrwx 1 root root 0 Oct 3 07:37 shibauthorizer.log -rwxrwxrwx 1 root root 0 Oct 3 07:38 shibauthorizer.error.log -rwxrwxrwx 1 root root 0 Oct 3 07:38 shibresponder.log -rwxrwxrwx 1 root root 0 Oct 3 07:38 shibresponder.error.log -rwxrwxrwx 1 root root 541 Oct 7 00:19 supervisord.log cat supervisord.log 2017-10-07 00:19:30,052 CRIT received SIGTERM indicating exit request 2017-10-07 00:19:30,240 CRIT Supervisor running as root (no user in config file) 2017-10-07 00:19:30,254 INFO /var/run/supervisord/supervisor.sock:Medusa (V1.1.1.1) started at Sat Oct 7 00:19:30 2017 Hostname: <unix domain socket> Port:/var/run/supervisord/supervisor.sock 2017-10-07 00:19:30,320 CRIT Running without any HTTP authentication checking 2017-10-07 00:19:30,322 INFO daemonizing the process 2017-10-07 00:19:30,322 INFO supervisord started with pid 21726 And the socket didnt exists ls -lrt /opt/shibboleth/shibauthorizer.sock ls: cannot access /opt/shibboleth/shibauthorizer.sock: No such file or directory ls -lrt /opt/shibboleth/shibresponder.sock ls: cannot access /opt/shibboleth/shibresponder.sock: No such file or directory

Seems my supervisord doesnt respect the fcgi-program as the program name, that was why i changed to "program"

here is my supervisord version [root@ip-10-0-0-33 supervisor]# yum list installed | grep super supervisor.noarch 2.1-9.el6 @epel

But the supervisord shld have fcgi-program support since very old version(article from 2010) ref: http://sachachua.com/blog/2010/06/using-supervisord-for-nginxfastcgiphp/

no clue yet...

davidjb commented 6 years ago

Ah, there's your problem. Your Supervisor version is much too old. See http://supervisord.org/configuration.html#fcgi-program-x-section-settings, but the socket and other options need Supervisord v3.0 or above.

Install that version and you'll be good to go. Unfortunately the version in EPEL is ancient so you'll have to build the RPMs yourself. You might consult https://github.com/jcu-eresearch/supervisor-rpm as a starting point (also in need of an update, but it's v3+). Otherwise, if upgrading OS is an option, Supervsiord 3.1.4 is packaged for EL7.

davidjb commented 6 years ago

I've also added a note to this effect at https://github.com/nginx-shib/nginx-http-shibboleth/commit/b55eeddf4f14a8f07c716003519e9f9bb80e38dd. Closing as resolved; message back otherwise.

chankongching commented 6 years ago

Yes it is running now! i cannot imagine it is resolved so easily, sorry for your time