OpenLiberty / open-liberty

Open Liberty is a highly composable, fast to start, dynamic application server runtime environment
https://openliberty.io
Eclipse Public License 2.0

InstantOn 24.0.0.8: Restore fails on OCP Power v9 with errors not user friendly #29240

Open abdulmateen-1 opened 2 months ago

abdulmateen-1 commented 2 months ago

Describe the bug
Restore fails on OCP Power v9 when the checkpoint is taken on an image that has SELinux set to permissive, and the error details shown don't clearly explain what the user should do in this case.

(00.059541) pie: 1025:  mmap(0x7a1183240000 -> 0x7a1183250000, 0x3 0x12 398)
(00.059548) pie: 1025:  mmap(0x7a1183250000 -> 0x7a11832b0000, 0x7 0x12 399)
(00.059554) pie: 1025:  mmap(0x7a11832b0000 -> 0x7a11832c0000, 0x3 0x12 399)
(00.059560) pie: 1025:  mmap(0x7a11832c0000 -> 0x7a11832d0000, 0x3 0x12 399)
(00.059567) pie: 1025:  mmap(0x7a11832d0000 -> 0x7a11833a0000, 0x3 0x32 -1)
(00.059573) pie: 1025:  mmap(0x7a11833a0000 -> 0x7a11833b0000, 0x0 0x32 -1)
(00.059578) pie: 1025:  mmap(0x7a11833b0000 -> 0x7a11833f0000, 0x3 0x32 -1)
(00.059583) pie: 1025:  mmap(0x7a11833f0000 -> 0x7a1183400000, 0x1 0x11 400)
(00.059717) pie: 1025: Error (criu/pie/restorer.c:1676): Can't restore 0x7a11833f0000 mapping with 0xfffffffffffffff3
(00.059728) pie: 1025: Error (criu/pie/restorer.c:2102): Restorer fail 1025
(00.059965) Error (criu/cr-restore.c:2547): Restoring FAILED.
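
A note on reading the failing line: the value `0xfffffffffffffff3` looks like a negated errno from the raw `mmap` call, printed as an unsigned 64-bit number. The sketch below assumes that interpretation and a Linux host, and converts the value back to an errno name. `decode_syscall_result` is a hypothetical helper for illustration, not part of CRIU or Open Liberty.

```python
import errno
import os


def decode_syscall_result(raw_hex: str) -> tuple[int, str, str]:
    """Interpret a 64-bit hex value as a negated errno (assumption)."""
    value = int(raw_hex, 16)
    # Reinterpret the unsigned 64-bit value as a signed one.
    if value >= 1 << 63:
        value -= 1 << 64
    code = -value
    # errno.errorcode maps numeric codes to symbolic names on this platform.
    name = errno.errorcode.get(code, "UNKNOWN")
    return code, name, os.strerror(code)


if __name__ == "__main__":
    # Value copied from the restorer.c:1676 error line above.
    print(decode_syscall_result("0xfffffffffffffff3"))
```

On Linux this decodes to errno 13 (`EACCES`), which would be consistent with an SELinux denial on the mapping, although the log alone does not confirm the cause.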

Expected behavior
A similar issue, https://github.com/OpenLiberty/open-liberty/issues/24522, was resolved with a user-friendly error message that explains what the user should do.

Diagnostic information:

tjwatson commented 2 months ago

Restore fails on OCP Power v9 when the checkpoint is taken on an image that has SELinux set to permissive, and the error details shown don't clearly explain what the user should do in this case.

What is the SELinux setting on the machine doing the checkpoint and on the machine doing the restore? It appears the checkpoint succeeded but the restore failed. I assume you had to update the settings on the machine doing the restore to make it work?

abdulmateen-1 commented 2 months ago

The checkpoint was taken on an EBC machine that was provisioned on demand. Since it's an Ubuntu machine, I think it has SELinux set to permissive or not configured at all. Yes, I had to update the settings to get the restore to work.
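
To answer the SELinux question for both machines with less guesswork, one option is to read the mode from selinuxfs, which is what `getenforce` reports. This is a minimal sketch, assuming the standard `/sys/fs/selinux` mount point; `selinux_mode` is a hypothetical helper name.

```python
from pathlib import Path


def selinux_mode(selinuxfs: str = "/sys/fs/selinux") -> str:
    """Return the SELinux mode based on the selinuxfs 'enforce' file."""
    enforce = Path(selinuxfs) / "enforce"
    try:
        flag = enforce.read_text().strip()
    except (FileNotFoundError, NotADirectoryError):
        # No selinuxfs usually means SELinux is disabled or not installed.
        return "disabled or not mounted"
    # The kernel exposes "1" for enforcing and "0" for permissive.
    return "enforcing" if flag == "1" else "permissive"


if __name__ == "__main__":
    print(selinux_mode())
```

Running it on both the checkpoint host and the restore host would show whether the two modes differ.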