OpenLiberty / open-liberty

Open Liberty is a highly composable, fast to start, dynamic application server runtime environment
https://openliberty.io
Eclipse Public License 2.0

checkpoint app with sessionCache-1.0 fails to run on OCP #28670

Open tam512 opened 5 months ago

tam512 commented 5 months ago

Testing the QuickSec SVT application with <feature>sessionCache-1.0</feature>. It is configured to run with httpSessionCache and runs fine on OCP without an InstantOn checkpoint. After performing a checkpoint and deploying the image to OCP, the app fails to run. I tried checkpointing at both beforeAppStart and afterAppStart, and both fail.

I have the following in bootstrap.properties at checkpoint time to work around the lack of full EJB support with InstantOn:

io.openliberty.checkpoint.allowed.features=enterpriseBeansRemote-4.0,enterpriseBeansPersistentTimer-4.0,enterpriseBeansHome-4.0,enterpriseBeans-4.0
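
As a quick sanity check before building, the override can be read back from the file. This is a minimal sketch, assuming the property sits on a single line of `bootstrap.properties`; the function name `list_allowed_features` and the path passed to it are illustrative and not part of Liberty.

```shell
#!/usr/bin/env bash
# Print each feature named in io.openliberty.checkpoint.allowed.features,
# one per line. Prints nothing if the property is absent from the file.
list_allowed_features() {
  local file="$1"
  grep -E '^io\.openliberty\.checkpoint\.allowed\.features=' "$file" \
    | cut -d= -f2- \
    | tr ',' '\n' \
    | sed 's/[[:space:]]*$//'
}
```

Calling `list_allowed_features config/bootstrap.properties` should list the four EJB features shown above if the file was copied into the image as intended.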

Errors in messages.log:

[6/6/24, 14:06:41:721 UTC] 00000037 com.ibm.ws.kernel.launch.internal.FrameworkManager           A Launching defaultServer (WebSphere Application Server 24.0.0.6/wlp-1.0.90.cl240620240603-2001) on Eclipse OpenJ9 VM, version 17.0.11+9 (en_US)
...........
........................
[6/6/24, 14:08:47:162 UTC] 00000072 yoko.verbose.request.in                                      W Servant method raised a non-CORBA exception
    Client receives this exception as CORBA::UNKNOWN
    operation name: "_get_messagePermitAll"
    transport info: null
    exception: java.lang.NullPointerException: Cannot invoke "java.lang.Class.newInstance()" because "tieClass" is null
    at com.ibm.ws.ejbcontainer.remote.internal.EJBServantLocatorImpl.createTie(EJBServantLocatorImpl.java:95)
    at com.ibm.ws.ejbcontainer.remote.internal.EJBServantLocatorImpl.getServant(EJBServantLocatorImpl.java:81)
    at com.ibm.ws.ejbcontainer.remote.internal.EJBServantLocatorImpl.preinvoke(EJBServantLocatorImpl.java:63)
    at org.apache.yoko.orb.OBPortableServer.ServantLocatorStrategy.preinvoke(ServantLocatorStrategy.java:101)
    at org.apache.yoko.orb.OBPortableServer.NonRetainStrategy.locate(NonRetainStrategy.java:105)
    at org.apache.yoko.orb.OBPortableServer.POA_impl._OB_dispatch(POA_impl.java:1219)
    at org.apache.yoko.orb.OB.DispatchRequest_impl.invoke(DispatchRequest_impl.java:56)
    at com.ibm.ws.transport.iiop.yoko.ExecutorDispatchStrategy$1.run(ExecutorDispatchStrategy.java:44)
    at com.ibm.ws.threading.internal.ExecutorServiceImpl$RunnableWrapper.run(ExecutorServiceImpl.java:280)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:857)
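
To confirm whether a given pod hit the same failure, the log can be searched for the two markers that appear in the trace above. A minimal sketch, assuming `messages.log` has been copied out of the pod; the function name `count_tie_failures` is made up for illustration.

```shell
#!/usr/bin/env bash
# Count the two failure markers seen in the trace:
#   1. the yoko warning about a non-CORBA exception
#   2. the NullPointerException that mentions "tieClass"
count_tie_failures() {
  local log="$1"
  local warnings npes
  # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
  warnings=$(grep -c 'Servant method raised a non-CORBA exception' "$log" || true)
  npes=$(grep -c '"tieClass" is null' "$log" || true)
  echo "non_corba_warnings=${warnings} tieclass_npes=${npes}"
}
```

Running it against logs from both the checkpointed and the non-checkpointed deployment would show whether the `tieClass` failure is specific to the restored server.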

Error in the browser when clicking the "Go & Get..." button in the QuickSec app:

Exception: jakarta.ejb.EJBException: nested exception is: java.rmi.RemoteException: CORBA UNKNOWN 0 No; nested exception is: org.omg.CORBA.UNKNOWN: : vmcid: 0x0 minor code: 0x0 completed: No
tjwatson commented 5 months ago

This should probably be transferred to the open-liberty repo

tjwatson commented 5 months ago

Can you post your steps to reproduce, including a server.xml and the application?

tam512 commented 5 months ago

The QuickSec application repo is here

FROM $INFINISPAN_IMAGE AS infinispan-client

USER root

# RUN infinispan-client-setup.sh - gets old files so running the commands individually

RUN set -Eeox pipefail
RUN yum update -y
RUN yum install -y maven
RUN mkdir -p /opt/ibm/wlp/usr/shared/resources/infinispan
RUN echo '<project> <modelVersion>4.0.0</modelVersion> <groupId>io.openliberty</groupId> <artifactId>openliberty-infinispan-client</artifactId> <version>1.0</version> <dependencies> <dependency> <groupId>org.infinispan</groupId> <artifactId>infinispan-jcache-remote</artifactId> <version>14.0.27.Final</version> </dependency> </dependencies> </project>' > /opt/ibm/wlp/usr/shared/resources/infinispan/pom.xml
RUN mvn -f /opt/ibm/wlp/usr/shared/resources/infinispan/pom.xml versions:use-latest-releases -DallowMajorUpdates=false
RUN mvn -f /opt/ibm/wlp/usr/shared/resources/infinispan/pom.xml dependency:copy-dependencies -DoutputDirectory=/opt/ibm/wlp/usr/shared/resources/infinispan
RUN yum remove -y maven
RUN rm -f /opt/ibm/wlp/usr/shared/resources/infinispan/pom.xml
RUN rm -f /opt/ibm/wlp/usr/shared/resources/infinispan/jboss-transaction-api*.jar
RUN rm -rf ~/.m2
RUN chown -R 1001:0 /opt/ibm/wlp/usr/shared/resources/infinispan
RUN chmod -R g+rw /opt/ibm/wlp/usr/shared/resources/infinispan
USER 1001

FROM $BASE_IMAGE AS open-liberty-infinispan

COPY --chown=1001:0 --from=infinispan-client /opt/ibm/wlp/usr/shared/resources/infinispan /config/datagrid

ENV INFINISPAN_SERVICE_NAME=datagrid
ENV INFINISPAN_PASS=datagridPass

# set env variable so it won't fall back to default server start if restore fails
ENV CRIU_RESTORE_DISABLE_RECOVERY=true

COPY --chown=1001:0 ./QuickSec/target/QuickSec.ear /config/apps/
COPY --chown=1001:0 config/server.xml /config/server.xml
COPY --chown=1001:0 config/jvm.options /config/jvm.options
COPY --chown=1001:0 config/datagrid.xml /config/datagrid.xml
COPY --chown=1001:0 config/ldap-config.xml /config/ldap-config.xml
COPY --chown=1001:0 config/bootstrap.properties /config/bootstrap.properties

# truststore for LDAP
COPY --chown=1001:0 config/openldap.p12 /config/openldap.p12
COPY --chown=1001:0 config/nest-ldap.p12 /config/nest-ldap.p12

# DB2 files
COPY --chown=1001:0 ./db2jars /config/db2jars

# DATAGRID files
COPY --chown=1001:0 ./datagrid /config/datagrid

# for WebSphere Liberty
COPY --chown=1001:0 featureUtility.properties /opt/ibm/wlp/etc/featureUtility.properties

# Setting for the verbose option
ARG VERBOSE=true
ARG FULL_IMAGE=false

# This script will add the requested XML snippets to enable Liberty features and grow image to be fit-for-purpose using featureUtility.
# Only available in 'kernel-slim'. The 'full' tag already includes all features for convenience.
RUN if [ "$FULL_IMAGE" = "true" ] ; then echo "Skip running features.sh for full image" ; else features.sh ; fi

# Add interim fixes for WL/OL (optional)
COPY --chown=1001:0 interim-fixes /opt/ol/fixes/
COPY --chown=1001:0 interim-fixes /opt/ibm/fixes/

# This script will add the requested XML snippets and grow image to be fit-for-purpose
RUN configure.sh

# remove infinispan-client-sessioncache.xml generated by configure.sh to avoid conflict with DataGrid configuration that is present for quicksec
USER root
RUN rm -rf /config/configDropins/overrides/infinispan-client-sessioncache.xml
USER 1001

RUN checkpoint.sh beforeAppStart

RUN checkpoint.sh afterAppStart

- Run checkpoint

podman build -t qs10-beforeappstart:wl-full-java17-x86 --cap-add=CHECKPOINT_RESTORE --cap-add=SYS_PTRACE --cap-add=SETPCAP --security-opt seccomp=unconfined -f Containerfile --no-cache --volume /opt/liberty-mavenartf:/opt/libertyrepo .

- Push the _qs10-beforeappstart:wl-full-java17-x86_ image to a repository that OCP can pull from
- On OCP with the rook-cephfs storage class, install the RH Datagrid Operator
- Install the WebSphere Liberty Operator
- Create a namespace for this app
- Deploy Infinispan using the YAML files under the `deploy/datagrid` folder
- Deploy the LDAP container (`deploy/svt-ldap.yaml`)
- Deploy the DB2 container (`deploy/db2`)
- Create a service account and security context per this Slack thread: https://ibm-cloud.slack.com/archives/C03MR7EC3NG/p1693408875306839?thread_ts=1693328418.096989&cid=C03MR7EC3NG
- Update `spec.applicationImage`, add the following to `deploy/05-app-deploy-wlo.yaml`, and deploy the app

serviceAccountName: instanton-sa
securityContext:
  allowPrivilegeEscalation: true
  privileged: false
  runAsNonRoot: true
  capabilities:
    add: