bstopp / puppet-aem

Puppet module for managing AEM Installations.
https://forge.puppet.com/bstopp/aem
Apache License 2.0

Add start opts support to AEM instance #121

Closed cliffano closed 5 years ago

cliffano commented 6 years ago

This PR adds the ability to specify start opts (https://helpx.adobe.com/experience-manager/6-3/sites/deploying/using/custom-standalone-install.html#FurtheroptionsavailablefromtheQuickstartfile , e.g. -nofork) for an AEM instance, consistent with JVM opts and JVM mem opts.

bstopp commented 5 years ago

Those start options are only valid when starting the jar directly. Since this unpacks the jar, the only start options are those here:

usage: org.apache.sling.launchpad.app.Main [ start | stop | status ] [ -j adr ] [ -l loglevel ] [ -f logfile ] [ -c slinghome ] [ -i launchpadhome ] [ -a address ] [ -p port ] { -Dn=v } [ -h ]
    start         listen for control connection (uses -j)
    stop          terminate running Apache Sling (uses -j)
    status        check whether Apache Sling is running (uses -j)
    threads       request a thread dump from Apache Sling (uses -j)
    -j adr        host and port to use for control connection in the format '[host:]port' (default 127.0.0.1:0)
    -l loglevel   the initial loglevel (0..4, FATAL, ERROR, WARN, INFO, DEBUG)
    -f logfile    the log file, "-" for stdout (default logs/error.log)
    -c slinghome  the sling context directory (default sling)
    -i launchpadhome  the launchpad directory (default slinghome)
    -a address    the interfact to bind to (use 0.0.0.0 for any)
    -p port       the port to listen to (default 8080)
    -r path       the root servlet context path for the http service (default is /)
    -n            don't install the shutdown hook
    -Dn=v         sets property n to value v. Make sure to use this option *after* the jar filename. The JVM also has a -D option which has a different meaning
    -h            prints this usage message

So things like nofork aren't really an option as the start script forks the process.
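For illustration, a start script for the unpacked layout could assemble those options roughly as follows. This is a minimal sketch, not the module's start.erb: the function name `build_start_opts`, the port 4502, the context root `/aem`, and the `-c . -i launchpad` values are assumptions, and the jar path is just an example from a 6.5.0 install. The command is printed rather than executed, since no jar is assumed to be present.

```shell
#!/bin/sh
# Hypothetical sketch: build the argument string for the standalone jar
# using only options listed in the usage message (-c, -i, -p, -r).
build_start_opts() {
  port=$1
  context_root=${2:-}
  # "-c ." and "-i launchpad" are assumed values, not taken from the module.
  opts="start -c . -i launchpad -p $port"
  if [ -n "$context_root" ]; then
    opts="$opts -r $context_root"
  fi
  printf '%s\n' "$opts"
}

# Example jar path; adjust to the actual install.
JAR="crx-quickstart/app/cq-quickstart-6.5.0-standalone-quickstart.jar"

# Print the command instead of running it.
echo "java -jar $JAR $(build_start_opts 4502 /aem)"
```

Note that nothing quickstart-specific such as -nofork appears here, because the standalone jar only accepts the options in the usage message.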

cliffano commented 5 years ago

I'm going to add a note on the problem that we encountered and what we observed after adding the -nofork start opt, in the hope that others who hit a similar scenario might find it useful in their troubleshooting.

We hit a production problem back when we were using AEM 6.2. An unexpected event that produced a lot of additional load kept crashing AEM publish, no matter how much memory we added to scale up as a temporary workaround. Later on we noticed that the added memory went to the parent process, while the forks could only be allocated 1 GB.

After adding -nofork, we observed that the forks were no longer created, so the parent process took all of the load and handled it easily. Since then, we've added that load profile to our performance testing regime, with -nofork added.

The above occurred 2 years or so ago, so details are sketchy by now. @bstopp My understanding from your latest comment is that the -nofork start opt that was added back then could've worked perhaps because AEM was started via the jar file directly.

Could you please elaborate on what you mean by 'unpacks the jar'? Another question: you mentioned that 'nofork isn't an option' because the start script from this module forks the process anyway. What would be your suggestion for solving the problem we encountered two years ago, i.e. the forks with 1 GB couldn't handle the load?

bstopp commented 5 years ago

The AEM jar you start with (the quickstart) can be run directly. This is where the -nofork option is available. One of the other options is -unpack, which extracts the standalone jar. Here's a tree view:

├── aem-quickstart-6.5.0.jar
└── crx-quickstart
    ├── app
    │   └── cq-quickstart-6.5.0-standalone-quickstart.jar
    ├── bin

The standalone doesn't have the same start options as the quickstart. AFAIK it can't be forked.

Running the quickstart shows this in the stderr.log file:

    Rogue:logs bryan$ cat stderr.log
    Low-memory action set to fork
    Using 64bit VM settings, min.heap=1024MB, min permgen=256MB, default fork arguments=[-Xmx1024m, -XX:MaxPermSize=256m]
    The JVM reports a heap size of 3641 MB, meets our expectation of 1024 MB +/- 20
    ...

There's no output like this anywhere when starting the standalone jar.

The standalone jar is what the Puppet module runs when it starts AEM.
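One way to tell which mode an instance started in is to look for the fork notice in stderr.log. The sketch below assumes the message text is exactly the "Low-memory action set to fork" line quoted above; the function name `quickstart_forked` and the temporary demo file are illustrative only.

```shell
#!/bin/sh
# Returns 0 if the given log file contains the quickstart's fork notice.
# The matched string is copied from the stderr.log excerpt quoted above.
quickstart_forked() {
  grep -q 'Low-memory action set to fork' "$1"
}

# Demo against a temporary file holding two of the quoted log lines.
log=$(mktemp)
cat > "$log" <<'EOF'
Low-memory action set to fork
Using 64bit VM settings, min.heap=1024MB, min permgen=256MB, default fork arguments=[-Xmx1024m, -XX:MaxPermSize=256m]
EOF

if quickstart_forked "$log"; then
  echo "quickstart was started with forking enabled"
else
  echo "no fork notice found"
fi
rm -f "$log"
```

On a real install, point the function at the instance's stderr.log instead of the temporary file.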

HTH.

cliffano commented 5 years ago

@bstopp Thanks for the explanation.

I've confirmed that the original problem occurred when the quickstart jar (non-standalone) was used, which is why the forks were created.

So this means the ideal solution to the '1 GB memory limitation on the forks' would have been to switch to the quickstart-standalone jar, rather than running the quickstart jar (non-standalone) with -nofork.

@bstopp Not specific to the performance problem, but wouldn't this PR still be handy for passing custom start opts which are available on the standalone jar? start.erb currently only covers some of the available opts.

bstopp commented 5 years ago

I could add more options, and I have open issues labeled as questions about whether they should be added. Here are the available options:

    start         listen for control connection (uses -j)
    stop          terminate running Apache Sling (uses -j)
    status        check whether Apache Sling is running (uses -j)
    threads       request a thread dump from Apache Sling (uses -j)
    -j adr        host and port to use for control connection in the format '[host:]port' (default 127.0.0.1:0)
    -l loglevel   the initial loglevel (0..4, FATAL, ERROR, WARN, INFO, DEBUG)
    -f logfile    the log file, "-" for stdout (default logs/error.log)
    -c slinghome  the sling context directory (default sling)
    -i launchpadhome  the launchpad directory (default slinghome)
    -a address    the interfact to bind to (use 0.0.0.0 for any)
    -p port       the port to listen to (default 8080)
    -r path       the root servlet context path for the http service (default is /)
    -n            don't install the shutdown hook
    -Dn=v         sets property n to value v. Make sure to use this option *after* the jar filename. The JVM also has a -D option which has a different meaning
    -h            prints this usage message

These are already supported:

    -c slinghome  the sling context directory (default sling)
    -i launchpadhome  the launchpad directory (default slinghome)
    -r path       the root servlet context path for the http service (default is /)

These don't make sense to support, IMO:

    -j adr        host and port to use for control connection in the format '[host:]port' (default 127.0.0.1:0)
    -l loglevel   the initial loglevel (0..4, FATAL, ERROR, WARN, INFO, DEBUG)
    -f logfile    the log file, "-" for stdout (default logs/error.log)
    -n            don't install the shutdown hook

That just leaves these:

    -a address    the interfact to bind to (use 0.0.0.0 for any)
    -Dn=v         sets property n to value v. Make sure to use this option *after* the jar filename. The JVM also has a -D option which has a different meaning
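As a rough sketch of how those two remaining options could be passed through, the helper below appends a bind address and any number of Sling properties after the jar filename, which is where the usage message says -Dn=v must go. Everything named here is hypothetical and not part of the module: the function `build_command`, the jar path, and the example values `0.0.0.0` and `sling.run.modes=publish`. The JVM-level `-Djava.awt.headless=true` is included only to contrast the two kinds of -D.

```shell
#!/bin/sh
# Hypothetical helper: print a start command with -a and -Dn=v appended.
# Usage: build_command JAR ADDRESS [PROP=VALUE ...]
build_command() {
  jar=$1
  address=$2
  shift 2
  # A -D placed before -jar is a JVM system property, not a Sling property.
  cmd="java -Djava.awt.headless=true -jar $jar start"
  if [ -n "$address" ]; then
    cmd="$cmd -a $address"
  fi
  # Sling properties must come after the jar filename.
  for prop in "$@"; do
    cmd="$cmd -D$prop"
  done
  printf '%s\n' "$cmd"
}

# Example values only; printed, not executed.
build_command crx-quickstart/app/cq-quickstart-6.5.0-standalone-quickstart.jar \
  0.0.0.0 sling.run.modes=publish
```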