projectbuendia / buendia

Main project repository (see the Wiki for details)
Apache License 2.0

Use Java options to prevent slow Tomcat startup #209

Closed (schuyler closed this 5 years ago)

schuyler commented 5 years ago

We've experienced slow server startup times in the past, due to a fairly well-known issue where Catalina blocks while waiting for randomness from /dev/random.

In 1562e92, we tackled this issue by replacing /dev/random with /dev/urandom. That approach has two disadvantages: first, Debian replaces the PRNG devices on boot, so the change does not persist; second, it works around a JVM "bug" with a system-wide change that could have unintended consequences.

This PR addresses the problem a little more directly by explicitly telling the JVM which PRNG to use when starting Catalina. The use of /dev/./urandom (sic) apparently sidesteps some cleverness internal to the JVM that we evidently don't want.
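As a rough illustration of the idea (not the actual line from this PR, which is not shown in this thread), the option could be appended to CATALINA_OPTS in whichever file the Tomcat startup script sources. The EGD_OPT variable name and the idempotency check are assumptions made for this sketch; only the option value itself comes from the discussion.

```shell
# Sketch only: the file this belongs in and the exact wording used by
# the PR are assumptions. EGD_OPT is a hypothetical helper variable.
EGD_OPT="-Djava.security.egd=file:/dev/./urandom"

# Append the option to CATALINA_OPTS unless it is already present,
# so sourcing this snippet twice does not duplicate the flag.
case " ${CATALINA_OPTS:-} " in
  *" ${EGD_OPT} "*) ;;
  *) CATALINA_OPTS="${CATALINA_OPTS:+$CATALINA_OPTS }${EGD_OPT}" ;;
esac
export CATALINA_OPTS
```

Setting the property per process keeps the change scoped to Tomcat's JVM and leaves the system's /dev/random untouched.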

schuyler commented 5 years ago

Note: I've tested this change by installing a dev package on Vagrant and confirming that the option appears on the JVM command line:

$ ps auwx | grep java
tomcat7   1744 62.0 62.2 1762996 313828 ?      Sl   22:18   1:03 /usr/lib/jvm/java-7-openjdk-amd64/bin/java -Djava.util.logging.config.file=/var/lib/tomcat7/conf/logging.properties -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -Djava.awt.headless=true -Xmx256m -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -Djava.security.egd=file:/dev/./urandom -Djava.endorsed.dirs=/usr/share/tomcat7/endorsed -classpath /usr/share/tomcat7/bin/bootstrap.jar:/usr/share/tomcat7/bin/tomcat-juli.jar -Dcatalina.base=/var/lib/tomcat7 -Dcatalina.home=/usr/share/tomcat7 -Djava.io.tmpdir=/tmp/tomcat7-tomcat7-tmp org.apache.catalina.startup.Bootstrap start
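The manual check above could be scripted along these lines. This is a minimal sketch: has_egd_option is a hypothetical helper name, and it only tests whether a given command-line string contains the option shown in the ps output.

```shell
# has_egd_option is a hypothetical helper: it succeeds (exit status 0)
# when the given command-line string contains the egd option.
has_egd_option() {
  case "$1" in
    *"-Djava.security.egd=file:/dev/./urandom"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example usage against the running JVM (assumes a Linux ps):
#   has_egd_option "$(ps auwx | grep '[j]ava')" && echo "option present"
```

The '[j]ava' pattern in the usage comment keeps grep from matching its own process.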

In theory this ought to solve the issue. We will have to keep an eye out and see if the slow startup times persist.

zestyping commented 5 years ago

If this works, it's a much better solution than the symlink.

Do we know that this command-line option causes the change we want in the OpenMRS server startup process? For example, could you run a startup without this option, replicate the delay, then re-run it with this option and compare the log output?

schuyler commented 5 years ago

I was able to repro the PRNG starvation issue by rebooting the NUC after a fresh install and configure, confirming that Debian had replaced /dev/random again, and then running the following loop in one tty while restarting the server and watching catalina.out in another:

$ while true; do time dd if=/dev/random of=/dev/null bs=1G count=1; done

I was able to produce "Creation of SecureRandom instance ... took [367,325] milliseconds" in the log. This accounted for the bulk of the more-than-6-minute start time for the Tomcat server. Watching the times on dd, I could confirm that large reads from /dev/random were slowing to dozens of seconds each.

With the shell loop still draining /dev/random, I added the CATALINA_OPTS line from this PR, then stopped and started Tomcat cold. From that point on, the "Creation of SecureRandom instance" line stopped appearing in catalina.out, and start time returned to the range of 54,000-55,000 ms.

I restarted Tomcat cold several times to confirm, and server startup times were consistent. I then rebooted the NUC and again confirmed 55 s start times for Tomcat.
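For anyone repeating this comparison, the delay can be pulled out of the log with a small filter. This is a sketch under assumptions: securerandom_ms is a hypothetical helper, and it matches only the parts of the log message quoted above, leaving the elided middle unconstrained.

```shell
# securerandom_ms is a hypothetical helper: it reads log text on stdin
# and prints the SecureRandom creation time in milliseconds for each
# matching line, with the thousands separators removed.
securerandom_ms() {
  sed -n 's/.*Creation of SecureRandom instance.*took \[\([0-9,]*\)\] milliseconds.*/\1/p' \
    | tr -d ','
}

# Example usage (the log path is an assumption based on catalina.base):
#   securerandom_ms < /var/lib/tomcat7/logs/catalina.out
```

No output means the message was not logged, which is the behaviour reported above once the option is in place.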

Thanks for pushing me to confirm this result. I believe it is verified.

zestyping commented 5 years ago

Fantastic! Thanks for the thorough testing.