CRaC / example-jetty


Segmentation fault #6

Closed CristianZatt closed 2 years ago

CristianZatt commented 2 years ago

Tried out this example on an Ubuntu 22.04 VM with both jdk17-crac+1 and jdk17-crac+2. Everything works until "$JAVA_HOME/bin/java -XX:CRaCRestoreFrom=cr", which results in a segmentation fault, so the application cannot be restored.

Also tried it with my own application, following the step-by-step guide, and the result was the same: the image files are generated, but the restore fails with a segmentation fault.

Used maven 3.6.3 running on the provided 17-crac JDKs to package.

In the presentations, the application was restored by running a restore.sh script. Is something missing from the documentation, or is there currently a problem with restoring applications?

AntonKozlov commented 2 years ago

Hmm. Thanks for the report! I'll set up a VM and try this out.

AntonKozlov commented 2 years ago

This seems to be caused by our older CRIU version, which lacks https://github.com/checkpoint-restore/criu/pull/1727. I'll try to merge the recent changes into CRaC's CRIU.

AntonKozlov commented 2 years ago

Could you try a preliminary build? https://github.com/AntonKozlov/openjdk-builds/actions/runs/2737755453 If it works, I'll integrate the changes into the next CRaC build.

CristianZatt commented 2 years ago

Sure, will test when I get home after work and let you know.

rbygrave commented 2 years ago

FYI:

I also get a segmentation fault with the prior build (17.0+2) (build 17-crac+2-10). With the jvm 17-crac+0-9 build from the link above, I no longer get a segmentation fault, but the checkpoint still doesn't succeed. Instead I see:

CR: Checkpoint ...
JVM: invalid info for restore provided (may be failed checkpoint)

And trying to run it:

$JAVA_HOME/bin/java -XX:CRaCRestoreFrom=cr
Error (criu/protobuf.c:72): Unexpected EOF on (empty-image)

Steps

$ export JAVA_HOME=~/apps/crac/jdk-crac-9

$ mvn --version
Apache Maven 3.8.5 (3599d3414f046de2324203b78ddcf9b5e4388aa0)
Maven home: /home/rob/app/maven
Java version: 17-crac, vendor: N/A, runtime: /home/rob/apps/crac/jdk-crac-9
Default locale: en_NZ, platform encoding: UTF-8
OS name: "linux", version: "5.4.0-122-generic", arch: "amd64", family: "unix"

$ mvn clean package
...

$ sudo rm -rf cr

$ $JAVA_HOME/bin/java -XX:CRaCCheckpointTo=cr -jar target/example-jetty-1.0-SNAPSHOT.jar
2022-07-29 11:46:25.070:INFO::main: Logging initialized @120ms to org.eclipse.jetty.util.log.StdErrLog
2022-07-29 11:46:25.123:INFO:oejs.Server:main: jetty-9.4.30.v20200611; built: 2020-06-11T12:34:51.929Z; git: 271836e4c1f4612f12b7bb13ef5a92a927634b0d; jvm 17-crac+0-9
2022-07-29 11:46:25.155:INFO:oejs.AbstractConnector:main: Started ServerConnector@4aa8f0b4{HTTP/1.1, (http/1.1)}{0.0.0.0:8080}
2022-07-29 11:46:25.158:INFO:oejs.Server:main: Started @218ms
2022-07-29 11:46:41.576:INFO:oejs.AbstractConnector:Thread-9: Stopped ServerConnector@4aa8f0b4{HTTP/1.1, (http/1.1)}{0.0.0.0:8080}
CR: Checkpoint ...
JVM: invalid info for restore provided (may be failed checkpoint)
2022-07-29 11:46:42.072:INFO:oejs.Server:Thread-9: jetty-9.4.30.v20200611; built: 2020-06-11T12:34:51.929Z; git: 271836e4c1f4612f12b7bb13ef5a92a927634b0d; jvm 17-crac+0-9
2022-07-29 11:46:42.074:INFO:oejs.AbstractConnector:Thread-9: Started ServerConnector@4aa8f0b4{HTTP/1.1, (http/1.1)}{0.0.0.0:8080}
2022-07-29 11:46:42.074:INFO:oejs.Server:Thread-9: Started @17134ms

In between, I ran curl localhost:8080 once the server had started, and then ran $JAVA_HOME/bin/jcmd target/example-jetty-1.0-SNAPSHOT.jar JDK.checkpoint to trigger the checkpoint.
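The sequence described here can be sketched as three small shell functions. The commands are the ones quoted in this comment. The `run` helper, the `DRY_RUN` switch, and the `JAR` and `IMAGE_DIR` variables are hypothetical names added for illustration. The sketch assumes that `JAVA_HOME` points at a CRaC-enabled JDK.

```shell
# Assumed defaults taken from the commands in this thread; adjust as needed.
JAR="${JAR:-target/example-jetty-1.0-SNAPSHOT.jar}"
IMAGE_DIR="${IMAGE_DIR:-cr}"

# Hypothetical helper: print a command, and run it unless DRY_RUN=1.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-0}" != "1" ]; then
    "$@"
  fi
}

# Terminal 1: start the server; the image is written to $IMAGE_DIR on checkpoint.
start_server() {
  run "$JAVA_HOME/bin/java" "-XX:CRaCCheckpointTo=$IMAGE_DIR" -jar "$JAR"
}

# Terminal 2: send one request, then ask the running JVM to checkpoint.
trigger_checkpoint() {
  run curl localhost:8080
  run "$JAVA_HOME/bin/jcmd" "$JAR" JDK.checkpoint
}

# Afterwards: restore the process from the saved image.
restore() {
  run "$JAVA_HOME/bin/java" "-XX:CRaCRestoreFrom=$IMAGE_DIR"
}
```

Setting `DRY_RUN=1` before calling a function only prints the command, which is a cheap way to confirm the paths before running anything for real.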

AntonKozlov commented 2 years ago

Could you check and post cr/dump4.log? If it contains messages about insufficient permissions, try re-extracting the JDK with sudo, as described at https://github.com/CRaC/docs#jdk.
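A rough way to run that check from the shell is sketched below. The function name `check_dump_log` and the search patterns are assumptions: they cover common wording for permission failures and may not match what CRIU writes, so reading the log directly is still worthwhile.

```shell
# Hypothetical helper: scan a CRIU dump log for permission-related messages.
# Exit status: 0 = nothing suspicious found, 1 = permission problem suspected,
# 2 = log file missing.
check_dump_log() {
  log="${1:-cr/dump4.log}"
  if [ ! -f "$log" ]; then
    echo "no log at $log"
    return 2
  fi
  # The patterns are a guess at typical wording, not an exhaustive list.
  if grep -i -n -E 'permission denied|operation not permitted' "$log"; then
    echo "permission problem suspected: try re-extracting the JDK with sudo"
    return 1
  fi
  echo "no permission messages found in $log"
  return 0
}
```

Calling `check_dump_log` with no argument inspects cr/dump4.log in the current directory. A different path can be passed as the first argument.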

rbygrave commented 2 years ago

then just try to re-extract jdk with sudo. https://github.com/CRaC/docs#jdk.

That was my problem. I didn't extract with sudo!! All working now:

$JAVA_HOME/bin/java -XX:CRaCCheckpointTo=cr -jar target/example-jetty-1.0-SNAPSHOT.jar
2022-07-29 23:23:19.924:INFO::main: Logging initialized @86ms to org.eclipse.jetty.util.log.StdErrLog
2022-07-29 23:23:19.967:INFO:oejs.Server:main: jetty-9.4.30.v20200611; built: 2020-06-11T12:34:51.929Z; git: 271836e4c1f4612f12b7bb13ef5a92a927634b0d; jvm 17-crac+0-9
2022-07-29 23:23:19.998:INFO:oejs.AbstractConnector:main: Started ServerConnector@4aa8f0b4{HTTP/1.1, (http/1.1)}{0.0.0.0:8080}
2022-07-29 23:23:20.000:INFO:oejs.Server:main: Started @172ms

2022-07-29 23:23:37.192:INFO:oejs.AbstractConnector:Thread-9: Stopped ServerConnector@4aa8f0b4{HTTP/1.1, (http/1.1)}{0.0.0.0:8080}
CR: Checkpoint ...
Killed

And then

$JAVA_HOME/bin/java -XX:CRaCRestoreFrom=cr
2022-07-29 23:24:55.718:INFO:oejs.Server:Thread-9: jetty-9.4.30.v20200611; built: 2020-06-11T12:34:51.929Z; git: 271836e4c1f4612f12b7bb13ef5a92a927634b0d; jvm 17-crac+0-9
2022-07-29 23:24:55.721:INFO:oejs.AbstractConnector:Thread-9: Started ServerConnector@4aa8f0b4{HTTP/1.1, (http/1.1)}{0.0.0.0:8080}
2022-07-29 23:24:55.721:INFO:oejs.Server:Thread-9: Started @95893ms

Awesome.

Side note: I own a few libraries that could support CRaC. My JDBC connection pool already has offline()/online() support, and I have an abstraction over Jetty plus a dependency injection library, so I can certainly play around with adding the lifecycle support needed for this.

CristianZatt commented 2 years ago

Yep, worked here on my VM too. Will try to apply it on some of our applications to see if we can get a good result for faster deploys on the development environment.

Nice project! Is there a roadmap or plan for making this production ready?

rbygrave commented 2 years ago

Just to put out my own thoughts re getting more prod ready:

AntonKozlov commented 2 years ago

@rbygrave @CristianZatt Thanks for checking! I'll include the changes in a future CRaC build.

AntonKozlov commented 2 years ago

@CristianZatt There is no schedule yet for production readiness. We need to comply with the OpenJDK quality standards, which cannot be done quickly, unfortunately.

AntonKozlov commented 2 years ago

@rbygrave It would be interesting to hear about your experience with the CRaC API. The org.crac library may help with using the API in real-world projects. https://github.com/CRaC/docs#orgcrac

Compression is indeed a useful and requested feature, and we are prioritizing it. For now it is implemented by external scripts in the deployment of https://github.com/CRaC/example-lambda, and we are planning how to do it in the VM.

Regarding pre-prod vs. canary environments: a checkpoint taken in the canary environment may give better results, because the profile and compilations will be captured in the image. Would that work for you?