naver / ngrinder

enterprise level performance testing solution
naver.github.io/ngrinder
Apache License 2.0

Script validation error #848

Closed mashuangwei closed 1 year ago

mashuangwei commented 2 years ago

1. Describe the bug 🐞

After the platform has been running for about one month, script validation starts reporting an error even though the script itself has no problem. The only way to get script validation working again is to restart the platform.

Error message in script verification: Error: Could not find or load main class net.grinder.engine.process.WorkerProcessEntryPoint

donggyu04 commented 2 years ago

@mashuangwei

Could you restart the controller?

mashuangwei commented 2 years ago

Yes, it works after I restart it, but the problem comes back after running for about one month. Restarting is the only way to solve it.

imbyungjun commented 2 years ago

I reproduced the error. In my case it also occurred after running the ngrinder-controller for about a month.

The ngrinder-controller log looked like this:

2022-08-24 16:23:03,097 INFO  LocalScriptTestDriveService.java:115 : 
grinder.jvm.classpath : /tmp/ngrinder.war-spring-boot-libs-061d2b83-cccb-4760-a0ad-abbd63b5cc3d/ngrinder-runtime-3.5.5-p2.jar
::/tmp/ngrinder.war-spring-boot-libs-061d2b83-cccb-4760-a0ad-abbd63b5cc3d/grinder-3.9.1-patch.jar
::/home/user/.ngrinder/script/admin
2022-08-24 16:23:03,099 INFO  LocalScriptTestDriveService.java:132 : jvm args :  -Djna.library.path=/home/user/.ngrinder/script/admin/lib  -Dpython.path=/home/user/.ngrinder/script/admin/lib  -Dpython.cachedir=/tmp/jython  -Dngrinder.etc.hosts=dev-fin-ngrinder-ctrl-001-ncl.nfra.io:127.0.0.1,localhost:127.0.0.1 -Dngrinder.enable.local-dns=true  -Duser.dir=/home/user/.ngrinder/script/admin  -Dngrinder.context=controller  -Dhttps.protocols=TLSv1.3,TLSv1.2,TLSv1.1,TLSv1,SSLv3  -Djsse.enableSNIExtension=false
2022-08-24 16:23:03,107 INFO  ErrorStreamRedirectWorkerLauncher.java:127 : worker validation-0 started

And the validation result was as follows.

Error: Could not find or load main class net.grinder.engine.process.WorkerProcessEntryPoint

This error occurs in the worker process, not on the controller side, so it cannot be caught in the controller code. The controller starts the worker process normally, but the worker process does not produce the expected result.

I tried to find the library files that are specified on the JVM classpath. In my case, they were /tmp/ngrinder.war-spring-boot-libs-061d2b83-cccb-4760-a0ad-abbd63b5cc3d/ngrinder-runtime-3.5.5-p2.jar and /tmp/ngrinder.war-spring-boot-libs-061d2b83-cccb-4760-a0ad-abbd63b5cc3d/grinder-3.9.1-patch.jar. When I checked the /tmp directory, I could see why the error occurs after running the controller for a month: the /tmp/ngrinder.war-spring-boot-libs-061d2b83-cccb-4760-a0ad-abbd63b5cc3d directory no longer existed. The default cleanup period for the /tmp directory is 30 days. After 30 days, the nGrinder library files in /tmp are cleaned up, and the worker process can no longer find the libraries it depends on.
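
The check described above can be sketched as a small script. This is only an illustration, assuming a Linux-style classpath string copied from the `grinder.jvm.classpath` log line; the function name `missing_classpath_entries` is made up for this example and is not part of nGrinder.

```python
import os
from typing import List


def missing_classpath_entries(classpath: str) -> List[str]:
    """Return the classpath entries that no longer exist on disk."""
    # The logged value is ':'-separated and contains empty segments ("::")
    # and line breaks, so strip whitespace and skip blank entries.
    entries = [part.strip() for part in classpath.split(":")]
    return [entry for entry in entries if entry and not os.path.exists(entry)]


# Classpath copied from the controller log above.
logged_classpath = (
    "/tmp/ngrinder.war-spring-boot-libs-061d2b83-cccb-4760-a0ad-abbd63b5cc3d/ngrinder-runtime-3.5.5-p2.jar\n"
    "::/tmp/ngrinder.war-spring-boot-libs-061d2b83-cccb-4760-a0ad-abbd63b5cc3d/grinder-3.9.1-patch.jar\n"
    "::/home/user/.ngrinder/script/admin"
)

for entry in missing_classpath_entries(logged_classpath):
    print("missing:", entry)
```

If any of the `/tmp/ngrinder.war-spring-boot-libs-...` entries are reported as missing on the controller host, the temp directory has most likely been cleaned up.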

This issue can occur when ngrinder-controller is run directly from the .war file (without Tomcat) for more than 30 days. I'm going to try to resolve it.

donggyu04 commented 2 years ago

@imbyungjun 😨

imbyungjun commented 2 years ago

According to this, I think the issue should be handled by setting the system property java.io.tmpdir, so that the temp directory is changed to one that will not be cleaned up.

Use one of the following workarounds.

  1. Run the .war file with the system property java.io.tmpdir. (The directory could be anywhere you want)
    java -Djava.io.tmpdir=/home/user/.ngrinder/tmp -jar ngrinder-controller/build/libs/ngrinder-controller-3.5.5-p1.war
  2. Run the ngrinder-controller with Tomcat.
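
Workaround 1 could be wrapped in a small launcher script. This is a minimal sketch: `controller_command` is a hypothetical helper, and it creates the directory first on the assumption that the JVM does not create the `java.io.tmpdir` location by itself.

```python
import os
from typing import List


def controller_command(war_path: str, tmp_dir: str) -> List[str]:
    """Create tmp_dir if needed and return the java command from workaround 1."""
    # Assumption: the JVM expects java.io.tmpdir to exist already,
    # so create it before starting the controller.
    os.makedirs(tmp_dir, exist_ok=True)
    return ["java", "-Djava.io.tmpdir=" + tmp_dir, "-jar", war_path]


# Example with the paths from workaround 1 (adjust to your installation):
#
#   import subprocess
#   subprocess.run(controller_command(
#       "ngrinder-controller/build/libs/ngrinder-controller-3.5.5-p1.war",
#       "/home/user/.ngrinder/tmp"), check=True)
```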

imbyungjun commented 2 years ago

How are the libraries unpacked?

nGrinder accesses dependent libraries at runtime for some features. These libraries need to be accessible as .jar files on the file system, so nGrinder uses the bootWar requiresUnpack option to unpack all the dependent libraries. How the requiresUnpack option works is described in the picture below.

[Image: bootjarunpack]

First, the build step.

The build adds a ZIP comment to each .jar file entry. The format of the comment is UNPACK:${sha1hash}. The ZIP comments can be checked with unzip, as shown below.

$ unzip -l ngrinder-controller-3.5.5-p1.war 
Archive:  ngrinder-controller-3.5.5-p1.war
  Length      Date    Time    Name
---------  ---------- -----   ----
        0  09-27-2022 14:30   META-INF/
      346  09-27-2022 14:30   META-INF/MANIFEST.MF
        0  02-01-1980 00:00   org/
        0  02-01-1980 00:00   org/springframework/
...
    25058  08-18-2022 18:34   WEB-INF/lib-provided/jakarta.annotation-api-1.3.5.jar
UNPACK:59eb84ee0d616332ff44aba065f3888cf002cd2d
   268755  08-22-2022 12:00   WEB-INF/lib-provided/tomcat-embed-websocket-9.0.37.jar
UNPACK:ee8b7c9081372bf40c41443c93317145a01e343a
  3383603  08-22-2022 12:00   WEB-INF/lib-provided/tomcat-embed-core-9.0.37.jar
UNPACK:c3f788de87f17eb57a9e7083736c1820fcbc1046
   237826  08-22-2022 12:00   WEB-INF/lib-provided/jakarta.el-3.0.3.jar
UNPACK:dab46ee1ee23f7197c13d7c40fce14817c9017df
---------                     -------
157755412                     737 files
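
The same per-entry comments can also be read programmatically. The sketch below uses Python's standard `zipfile` module; the function name `entries_marked_for_unpack` is made up for this example.

```python
import zipfile
from typing import Dict

UNPACK_PREFIX = "UNPACK:"


def entries_marked_for_unpack(archive_path: str) -> Dict[str, str]:
    """Map entry name -> hash for every entry whose ZIP comment starts with UNPACK:."""
    marked = {}
    with zipfile.ZipFile(archive_path) as archive:
        for info in archive.infolist():
            # ZipInfo.comment holds the per-entry comment as bytes.
            comment = info.comment.decode("utf-8", errors="replace")
            if comment.startswith(UNPACK_PREFIX):
                marked[info.filename] = comment[len(UNPACK_PREFIX):]
    return marked


# Example (path is a placeholder for your own build output):
#
#   for name, sha1 in entries_marked_for_unpack("ngrinder-controller-3.5.5-p1.war").items():
#       print(sha1, name)
```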

Second, the execution step.

When you run the java -jar command, the JVM runs JarLauncher's main method, as specified in the MANIFEST.MF file. The JarLauncher creates a ClassLoader with all the archives, including the libraries and the project classes. At the same time, the launcher checks whether each dependent jar file has a ZIP comment that starts with UNPACK:. If the comment exists, the launcher unpacks that jar to the temporary directory.
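
To make the unpacking step concrete, here is a rough Python illustration of the idea. It is not Spring Boot's actual implementation: `unpack_marked_libs` is a hypothetical function, and the directory naming is only modeled on the `/tmp/ngrinder.war-spring-boot-libs-<uuid>` path seen in the controller log.

```python
import os
import shutil
import tempfile
import uuid
import zipfile
from typing import List


def unpack_marked_libs(archive_path: str, tmp_root: str = "") -> List[str]:
    """Copy every UNPACK:-marked entry of the archive into a fresh temp directory."""
    # Default to the system temp directory, which plays the role of java.io.tmpdir here.
    root = tmp_root or tempfile.gettempdir()
    # Directory name modeled on the log: <archive name>-spring-boot-libs-<uuid>
    target_dir = os.path.join(
        root, f"{os.path.basename(archive_path)}-spring-boot-libs-{uuid.uuid4()}")
    unpacked = []
    with zipfile.ZipFile(archive_path) as archive:
        for info in archive.infolist():
            if not info.comment.startswith(b"UNPACK:"):
                continue  # entries without the marker stay inside the archive
            os.makedirs(target_dir, exist_ok=True)
            destination = os.path.join(target_dir, os.path.basename(info.filename))
            with archive.open(info) as src, open(destination, "wb") as dst:
                shutil.copyfileobj(src, dst)
            unpacked.append(destination)
    return unpacked
```

Because the unpacked files live only in the temp directory, anything that removes that directory while the process is still running leaves the classpath pointing at files that no longer exist.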