Knotx / knotx-cookbook

Cookbook for automated Knot.x deployment
http://knotx.io
Apache License 2.0

Service (re)start doesn't fail if knot.x config is invalid or missing #2

Open jwadolowski opened 7 years ago

jwadolowski commented 7 years ago

By accident, I configured my knot.x instance to use a config file from the wrong directory. chef-client finished successfully; however, the knot.x instance was not running.

Please find all the details below.

Current state

knot.x log: https://gist.github.com/jwadolowski/3848090e722a9644f41c49e64a94073d

Directory structure

$ tree /opt/knotx/
/opt/knotx/
├── author
│   ├── app
│   │   └── knotx.jar -> /opt/knotx/author/knotx-standalone-1.1.0.fat.jar
│   ├── config
│   │   ├── author.json
│   │   └── publish.json
│   ├── knotx.conf
│   ├── knotx-standalone-1.1.0.fat.jar
│   └── logback.xml
├── file-uploads
└── publish
    ├── app
    │   └── knotx.jar -> /opt/knotx/publish/knotx-standalone-1.1.0.fat.jar
    ├── config
    │   ├── author.json
    │   └── publish.json
    ├── knotx.conf
    ├── knotx-standalone-1.1.0.fat.jar
    └── logback.xml

7 directories, 12 files

knotx.conf file (the referenced /opt/knotx/author/config-author.json does not exist; the actual config files live under /opt/knotx/author/config/)

$ cat author/knotx.conf
#!/bin/bash

# Knotx general attributes

KNOTX_CONFIG="-conf /opt/knotx/author/config-author.json"
KNOTX_CONFIG_EXTRA=""

# Knotx JVM attributes

KNOTX_MIN_HEAP=256
KNOTX_MAX_HEAP=1024
KNOTX_CODE_CACHE=64
KNOTX_PORT=8092
KNOTX_EXTRA_OPTS=

KNOTX_GC_OPTS="\
  -Xloggc:/var/log/knotx/gc-author.log \
  -XX:+PrintGCApplicationStoppedTime \
  -XX:+PrintGCApplicationConcurrentTime \
  -XX:+PrintGC \
  -XX:+PrintGCTimeStamps \
  -XX:+PrintGCDetails \
  -XX:+UseGCLogFileRotation \
  -XX:NumberOfGCLogFiles=10 \
  -XX:GCLogFileSize=5M"

KNOTX_JMX_OPTS="\
  -Dcom.sun.management.jmxremote \
  -Dcom.sun.management.jmxremote.port=18092 \
  -Dcom.sun.management.jmxremote.rmi.port=18092 \
  -Djava.rmi.server.hostname=10.0.2.15 \
  -Dcom.sun.management.jmxremote.ssl=false \
  -Dcom.sun.management.jmxremote.authenticate=false"

Service status

$ systemctl status knotx-author
● knotx-author.service - knotx-author
   Loaded: loaded (/etc/systemd/system/knotx-author.service; enabled; vendor preset: disabled)
   Active: inactive (dead) since Tue 2017-07-25 12:07:19 UTC; 5min ago
  Process: 7921 ExecStart=/bin/sh -c /usr/bin/java    -Dlogback.configurationFile=/opt/knotx/author/logback.xml    -Dvertx.cacheDirBase=/opt/knotx/author/.vertx    -Xms${KNOTX_MIN_HEAP}m    -Xmx${KNOTX_MAX_HEAP}m    -XX:ReservedCodeCacheSize=${KNOTX_CODE_CACHE}m    -XX:+UseBiasedLocking    -XX:BiasedLockingStartupDelay=0    -Dio.knotx.KnotxServer.options.config.httpPort=${KNOTX_PORT}    -cp /opt/knotx/author/app/* io.knotx.launcher.LogbackLauncher    ${KNOTX_JMX_OPTS}    ${KNOTX_DEBUG_OPTS}    ${KNOTX_GC_OPTS}    ${KNOTX_EXTRA_OPTS}    ${KNOTX_CONFIG}    ${KNOTX_CONFIG_EXTRA} >> /var/log/knotx/knotx-author.log 2>&1 (code=exited, status=0/SUCCESS)
 Main PID: 7921 (code=exited, status=0/SUCCESS)

Jul 25 12:07:18 default-centos-73-chef-12.vagrantup.com systemd[1]: Started knotx-author.
Jul 25 12:07:18 default-centos-73-chef-12.vagrantup.com systemd[1]: Starting knotx-author...

Expected state

The service shouldn't start if the config file is missing (this may be a knot.x bug).
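
Until that is addressed, one possible workaround is a pre-flight check that fails when the file named after -conf is missing. The sketch below is only an illustration: the function names knotx_conf_path and knotx_check_config are hypothetical and not part of knotx-cookbook, and it assumes the path contains no spaces.

```shell
#!/bin/bash
# Hypothetical pre-flight check; not part of knotx-cookbook.

# Print the path that follows "-conf" in a KNOTX_CONFIG-style string.
# Assumption: the path contains no whitespace or glob characters.
knotx_conf_path() {
  local prev="" word
  for word in $1; do
    if [ "$prev" = "-conf" ]; then
      printf '%s\n' "$word"
      return 0
    fi
    prev="$word"
  done
  return 1
}

# Return 0 only if the referenced config file exists and is readable.
# Returns 2 when no -conf option is present, 1 when the file is missing.
knotx_check_config() {
  local path
  path="$(knotx_conf_path "$1")" || {
    echo "no -conf option found in: $1" >&2
    return 2
  }
  if [ ! -r "$path" ]; then
    echo "knot.x config not readable: $path" >&2
    return 1
  fi
  return 0
}
```

After sourcing knotx.conf, calling knotx_check_config "$KNOTX_CONFIG" would return 1 for the setup shown above. Wiring such a check into the unit (for example through systemd's ExecStartPre) is an assumption about how the cookbook could use it, not something it does today.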

jwadolowski commented 7 years ago

[root@default-centos-73-chef-12 ~]# /usr/bin/java    -Dlogback.configurationFile=/opt/knotx/publish/logback.xml    -Dvertx.cacheDirBase=/opt/knotx/publish/.vertx    -Xms256m    -Xmx1024m    -XX:ReservedCodeCacheSize=64m    -XX:+UseBiasedLocking    -XX:BiasedLockingStartupDelay=0    -Dio.knotx.KnotxServer.options.config.httpPort=8093    -cp /opt/knotx/publish/app/* io.knotx.launcher.LogbackLauncher      -Dcom.sun.management.jmxremote   -Dcom.sun.management.jmxremote.port=18093   -Dcom.sun.management.jmxremote.rmi.port=18093   -Djava.rmi.server.hostname=10.0.2.15   -Dcom.sun.management.jmxremote.ssl=false   -Dcom.sun.management.jmxremote.authenticate=false          -Xloggc:/var/log/knotx/gc-publish.log   -XX:+PrintGCApplicationStoppedTime   -XX:+PrintGCApplicationConcurrentTime   -XX:+PrintGC   -XX:+PrintGCTimeStamps   -XX:+PrintGCDetails   -XX:+UseGCLogFileRotation   -XX:NumberOfGCLogFiles=10   -XX:GCLogFileSize=5M        -conf /opt/knotx/publish/config-publish.json > /dev/null 2>&1
[root@default-centos-73-chef-12 ~]# echo $?
0

systemd thinks the service started properly even though it failed shortly afterwards, because the above command exits with status 0.
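
To illustrate the point about the exit status: for a process that is supposed to keep running, a wrapper could treat any early exit, including status 0, as a failure. This is a rough sketch only; run_long_lived is a hypothetical name, and signal forwarding for a clean stop is deliberately left out.

```shell
#!/bin/bash
# Hypothetical wrapper; the function name and message are illustrative.

# Run a long-lived command in the foreground. If it exits with status 0,
# report that as a failure (return 1), because a server process is
# expected to keep running until it is stopped. Non-zero statuses are
# passed through unchanged.
run_long_lived() {
  local rc=0
  "$@" || rc=$?
  if [ "$rc" -eq 0 ]; then
    echo "process exited unexpectedly with status 0: $1" >&2
    return 1
  fi
  return "$rc"
}
```

With something like this around the java command, the case above would surface as a non-zero exit instead of status=0/SUCCESS. It also turns a regular, clean shutdown into a failure, so it is a diagnostic aid rather than a fix.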

jwadolowski commented 7 years ago

Just raised a related knot.x issue: https://github.com/Cognifide/knotx/issues/314