Closed DocCyblade closed 7 years ago
Might have found the issue
So this is what I get when I try and start the service
root@tkl-mayan-edms /etc/init.d# service php-fastcgi stop
root@tkl-mayan-edms /etc/init.d# service php-fastcgi start
Job for php-fastcgi.service failed. See 'systemctl status php-fastcgi.service' and 'journalctl -xn' for details.
root@tkl-mayan-edms /etc/init.d# systemctl status php-fastcgi.service
* php-fastcgi.service - LSB: Start and stop php-cgi in external FASTCGI mode
Loaded: loaded (/etc/init.d/php-fastcgi)
Active: failed (Result: exit-code) since Wed 2016-10-12 17:28:36 UTC; 7s ago
Process: 9650 ExecStart=/etc/init.d/php-fastcgi start (code=exited, status=1/FAILURE)
Oct 12 17:28:36 tkl-mayan-edms systemd[1]: php-fastcgi.service: control process exited, code=exited status=1
Oct 12 17:28:36 tkl-mayan-edms systemd[1]: Failed to start LSB: Start and stop php-cgi in external FASTCGI mode.
Oct 12 17:28:36 tkl-mayan-edms systemd[1]: Unit php-fastcgi.service entered failed state.
See the line Process: 9650 ExecStart=/etc/init.d/php-fastcgi start (code=exited, status=1/FAILURE)
So I had no idea why; systemd is not very fun to troubleshoot. However, I added set -ex to the init script and tried again:
root@tkl-mayan-edms /etc/init.d# service php-fastcgi stop
root@tkl-mayan-edms /etc/init.d# service php-fastcgi start
Job for php-fastcgi.service failed. See 'systemctl status php-fastcgi.service' and 'journalctl -xn' for details.
root@tkl-mayan-edms /etc/init.d# systemctl status php-fastcgi.service
* php-fastcgi.service - LSB: Start and stop php-cgi in external FASTCGI mode
Loaded: loaded (/etc/init.d/php-fastcgi)
Active: failed (Result: exit-code) since Wed 2016-10-12 17:31:35 UTC; 36s ago
Process: 9782 ExecStart=/etc/init.d/php-fastcgi start (code=exited, status=1/FAILURE)
Oct 12 17:31:35 tkl-mayan-edms php-fastcgi[9782]: + [ yes != yes -a start != stop ]
Oct 12 17:31:35 tkl-mayan-edms php-fastcgi[9782]: + export PHP_FCGI_CHILDREN PHP_FCGI_MAX_REQUESTS
Oct 12 17:31:35 tkl-mayan-edms php-fastcgi[9782]: + DAEMON_ARGS=-q -b /var/run/nginx/php-fastcgi.sock
Oct 12 17:31:35 tkl-mayan-edms php-fastcgi[9782]: + [ no != no ]
Oct 12 17:31:35 tkl-mayan-edms php-fastcgi[9782]: + do_start
Oct 12 17:31:35 tkl-mayan-edms php-fastcgi[9782]: + start-stop-daemon --start --quiet --pidfile /var/run/php-fastcgi.pid ...-test
Oct 12 17:31:35 tkl-mayan-edms php-fastcgi[9782]: + return 1
Oct 12 17:31:35 tkl-mayan-edms systemd[1]: php-fastcgi.service: control process exited, code=exited status=1
Oct 12 17:31:35 tkl-mayan-edms systemd[1]: Failed to start LSB: Start and stop php-cgi in external FASTCGI mode.
Oct 12 17:31:35 tkl-mayan-edms systemd[1]: Unit php-fastcgi.service entered failed state.
Hint: Some lines were ellipsized, use -l to show in full.
Now it points out the offending line:
Oct 12 17:31:35 tkl-mayan-edms php-fastcgi[9782]: + start-stop-daemon --start --quiet --pidfile /var/run/php-fastcgi.pid ...-test
https://github.com/DocCyblade/tkl-mayan-edms/blob/dev/overlay/etc/init.d/php-fastcgi#L55
The issue is the testing line. I understand the code is there to detect whether it's already running, but it seems systemd treats this as an error. I might need to rewrite this to just try to start it and not worry about sending pretty messages. The only concern is that apps on the hub and in LXC use systemV and would still need to run the script.
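A caveat on reading that trace: with set -e in effect, a plain call to a function that returns non-zero ends the script on the spot, so the set -ex run may have stopped at do_start for a different reason than the original failure. Below is a minimal sketch of that behaviour. The function do_start_demo is a hypothetical stand-in for do_start, not the real code from the init script.

```shell
#!/bin/sh
# do_start_demo is a hypothetical stand-in for do_start: it returns 1,
# the same value seen in the trace ("+ return 1").
demo='
do_start_demo() { return 1; }
do_start_demo
echo "reached the case statement"
'

# Without -e the script carries on after the function call.
if out_plain=$(sh -c "$demo"); then status_plain=0; else status_plain=$?; fi

# With -e the child shell exits as soon as the plain call returns non-zero.
if out_e=$(sh -c "set -e; $demo"); then status_e=0; else status_e=$?; fi

echo "without -e: status=$status_plain output='$out_plain'"
echo "with -e:    status=$status_e output='$out_e'"
```

If this matches what happened here, the return 1 in the trace is only where -e stopped the script, and the unmodified script would have continued into the case statement.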
Well, after more testing I think I have pinpointed it to the code below: https://github.com/DocCyblade/tkl-mayan-edms/blob/dev/overlay/etc/init.d/php-fastcgi#L91
case "$1" in
  start)
    [ "$VERBOSE" != no ] && log_daemon_msg "Starting $DESC" "$NAME"
    do_start
    case "$?" in
        0|1) [ "$VERBOSE" != no ] && log_end_msg 0 ;;
        2) [ "$VERBOSE" != no ] && log_end_msg 1 ;;
    esac
    ;;
Removing the [ "$VERBOSE" != no ] &&
case "$1" in
  start)
    [ "$VERBOSE" != no ] && log_daemon_msg "Starting $DESC" "$NAME"
    do_start
    case "$?" in
        0|1) log_end_msg 0 ;;
        2) log_end_msg 1 ;;
    esac
    ;;
When we try and start it
root@tkl-mayan-edms /etc/init.d# service php-fastcgi start
root@tkl-mayan-edms /etc/init.d# systemctl status php-fastcgi.service
* php-fastcgi.service - LSB: Start and stop php-cgi in external FASTCGI mode
Loaded: loaded (/etc/init.d/php-fastcgi)
Active: active (running) since Wed 2016-10-12 18:24:23 UTC; 3s ago
Process: 11024 ExecStart=/etc/init.d/php-fastcgi start (code=exited, status=0/SUCCESS)
CGroup: /system.slice/php-fastcgi.service
|-11032 /usr/bin/php-cgi -q -b /var/run/nginx/php-fastcgi.sock
|-11040 /usr/bin/php-cgi -q -b /var/run/nginx/php-fastcgi.sock
`-11041 /usr/bin/php-cgi -q -b /var/run/nginx/php-fastcgi.sock
Oct 12 18:24:23 tkl-mayan-edms php-fastcgi[11024]: + RED=
Oct 12 18:24:23 tkl-mayan-edms php-fastcgi[11024]: + YELLOW=
Oct 12 18:24:23 tkl-mayan-edms php-fastcgi[11024]: + NORMAL=
Oct 12 18:24:23 tkl-mayan-edms php-fastcgi[11024]: + [ 0 -eq 0 ]
Oct 12 18:24:23 tkl-mayan-edms php-fastcgi[11024]: + echo .
Oct 12 18:24:23 tkl-mayan-edms php-fastcgi[11024]: .
Oct 12 18:24:23 tkl-mayan-edms php-fastcgi[11024]: + log_end_msg_post 0
Oct 12 18:24:23 tkl-mayan-edms php-fastcgi[11024]: + :
Oct 12 18:24:23 tkl-mayan-edms php-fastcgi[11024]: + return 0
Oct 12 18:24:23 tkl-mayan-edms systemd[1]: Started LSB: Start and stop php-cgi in external FASTCGI mode.
The biggest question is WHY? If I run this as a regular script, it works fine, but systemd does not like it.
@JedMeister - So looking at the above post, can you try and explain what is going on? I can get it to work by removing [ "$VERBOSE" != no ] &&
Question is WHY, and will that affect LXC and hub installations that use systemV?
Ok well I've spent a little time looking at this and I honestly don't understand why (I haven't actually done any real world testing). And TBH it actually concerns me a little as this should be the exact code that the Nginx and Mattermost appliances are already using. And in my testing (albeit some time ago) everything "just works".
It's almost like it's acting like -e has been set in the initscript (i.e. exit 1 when it hits line 91). Although if that's the case then it should actually be returning 1 (erroring) at line 88 when [ "$VERBOSE" != no ] && is first used.
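For what it's worth, set -e does not trip on a failing test on the left-hand side of &&. The shell skips the right-hand side and carries on, but $? is left at 1. Here is a minimal sketch of that. VERBOSE=no is an assumption taken from the "+ [ no != no ]" line in the trace, and the echo is a stand-in for log_daemon_msg.

```shell
#!/bin/sh
set -e
VERBOSE=no

# The test fails, so the echo is skipped. Because the failing command is
# not the last one in the && list, set -e does not end the script here.
[ "$VERBOSE" != no ] && echo "Starting demo"
and_list_status=$?

echo "still running, and-list status was $and_list_status"
```

So a guard like this would not abort the script part-way through even with -e, but it does leave a non-zero status behind for whatever reads $? next.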
Hang on... I've just noticed that you have removed the last 3 lines from the file. Initially I didn't think it would make any difference (it's just a colon with a blank line either side); however, I've just done some quick reading and discovered that : is a POSIX shell built-in equivalent to true.
Now I would expect that that still shouldn't really matter, but perhaps for some reason it does? Especially considering that that's the only difference between the files in Nginx and Mattermost appliances and last I checked both of those "just work"!? Perhaps add those lines back and see what happens?
Ahhhhh, I did copy and paste; maybe I missed something. I'll try it out.
@JedMeister - I thought I had stumped you! Guess not!
So after reading this question about the colon in scripts and this one about $? I finally understand the script as a whole.
The script relies on the colon at the end: unless we explicitly exit with a code other than 0/true, the colon is the last command to run, so the script exits with 0/true. Because the colon was not there, the last command to run was the [ "$VERBOSE" != no ] test, which is false (1) when VERBOSE is no, so systemd picked that up instead of the colon's true/0. As I read the Q/A, the colon makes sure the correct return/exit value is set, probably because of the calls to log_end_msg.
I am building it now... I'll let you know how it works. I have a feeling this mystery has been solved. I have learned a great deal, so the head pounding and time was definitely not wasted!
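The effect of the trailing colon can be reproduced in a few lines. This is only a sketch: the two-line "tail" is a hypothetical stand-in for the end of the init script, VERBOSE=no is assumed from the trace, and echo ok stands in for log_end_msg 0.

```shell
#!/bin/sh
# Hypothetical stand-in for the tail of the init script.
# VERBOSE=no is assumed; "echo ok" stands in for log_end_msg 0.
tail_without_colon='
VERBOSE=no
[ "$VERBOSE" != no ] && echo ok
'

# Same tail, with the ":" that the original file ends with.
tail_with_colon="$tail_without_colon
:
"

# Run each variant in its own shell and record the exit status.
if sh -c "$tail_without_colon"; then status_without=0; else status_without=$?; fi
if sh -c "$tail_with_colon"; then status_with=0; else status_with=$?; fi

echo "without trailing colon: exit status $status_without"
echo "with trailing colon:    exit status $status_with"
```

Without the colon the child shell exits with the status of the failed test (1). With it, the colon is the last command and the exit status is 0, which is what systemd needs to see to count the start as successful.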
@JedMeister - And you were right as rain! 👍
I think I am happy to call this beta ready. I'll kick the tires and shake out the bugs and fine tune things.
Awesome! :smile:
The issue: when trying to start the service, it reports a failure with no reason given, but the process does start.