OpenLiberty / open-liberty

Open Liberty is a highly composable, fast to start, dynamic application server runtime environment
https://openliberty.io
Eclipse Public License 2.0

io.openliberty.cdi.4.0.internal.services.fragment bundle cannot resolve dynamically against the host bundle #26680

Closed tjwatson closed 5 months ago

tjwatson commented 8 months ago

Related to #26599

The CDI 4.0 feature has a host bundle, io.openliberty.jakarta.cdi.4.0, that is included in the private feature io.openliberty.jakarta.cdi-4.0. This private feature is included by other features that need the CDI API but do not want to enable the full CDI implementation. The problem occurs when the public cdi-4.0 feature is enabled dynamically while the private feature io.openliberty.jakarta.cdi-4.0 is already enabled: the fragment bundle io.openliberty.cdi.4.0.internal.services.fragment cannot resolve dynamically. This is a limitation of OSGi, because the fragment introduces a new requirement: an import of the package org.jboss.weld.lite.extension.translator.

PR #26599 forces the host to re-resolve (stop/refresh/start) so that it gets a new classloader. This also causes many other bundles in the chain of dependencies on CDI to be refreshed. This should "just work", but other parts of the system seem to hold onto stale references, which leads to class cast exceptions like the following:

Exception = java.lang.ClassCastException
Source = com.ibm.ws.webcontainer.servlet.ServletWrapper.destroy
probeid = 403
Stack Dump = java.lang.ClassCastException: class com.ibm.ws.webcontainer.security.metadata.SecurityServletConfiguratorHelper cannot be cast to class com.ibm.ws.webcontainer.security.metadata.SecurityMetadata (com.ibm.ws.webcontainer.security.metadata.SecurityServletConfiguratorHelper is in unnamed module of loader org.eclipse.osgi.internal.loader.EquinoxClassLoader @344202d8; com.ibm.ws.webcontainer.security.metadata.SecurityMetadata is in unnamed module of loader org.eclipse.osgi.internal.loader.EquinoxClassLoader @40cad1a9)
        at com.ibm.ws.webcontainer.security.util.WebConfigUtils.getSecurityMetadata(WebConfigUtils.java:69)
        at com.ibm.ws.webcontainer.security.WebAppSecurityCollaboratorImpl.getSecurityMetadata(WebAppSecurityCollaboratorImpl.java:1429)
        at com.ibm.ws.webcontainer.security.admin.internal.WebAdminSecurityCollaboratorImpl.getSecurityMetadata(WebAdminSecurityCollaboratorImpl.java:132)
        at com.ibm.ws.webcontainer.security.WebAppSecurityCollaboratorImpl.preInvoke(WebAppSecurityCollaboratorImpl.java:624)
        at com.ibm.ws.webcontainer.security.WebAppSecurityCollaboratorImpl.preInvoke(WebAppSecurityCollaboratorImpl.java:1061)
        at com.ibm.ws.webcontainer.servlet.ServletWrapper.doDestroy(ServletWrapper.java:1006)
        at com.ibm.ws.webcontainer.servlet.ServletWrapper.destroy(ServletWrapper.java:1156)
        at com.ibm.ws.webcontainer.osgi.servlet.ServletWrapper.destroy(ServletWrapper.java:95)
        at com.ibm.ws.webcontainer.webapp.WebApp.destroy(WebApp.java:4014)
        at com.ibm.ws.webcontainer.osgi.webapp.WebApp.destroy(WebApp.java:1470)
        at com.ibm.ws.container.AbstractContainer.destroy(AbstractContainer.java:81)
        at com.ibm.ws.webcontainer.webapp.WebGroup.destroy(WebGroup.java:217)
        at com.ibm.ws.webcontainer.webapp.WebGroup.removeWebApplication(WebGroup.java:258)
        at com.ibm.ws.webcontainer.VirtualHost.removeWebApplication(VirtualHost.java:274)
        at com.ibm.ws.webcontainer.VirtualHost.removeWebApplication(VirtualHost.java:251)
        at com.ibm.ws.webcontainer.WebContainer.removeWebApplication(WebContainer.java:602)
        at com.ibm.ws.webcontainer.osgi.WebContainer.removeModule(WebContainer.java:1293)
        at com.ibm.ws.webcontainer.osgi.WebContainer.stopModule(WebContainer.java:1257)
        at com.ibm.ws.webcontainer.osgi.WebContainer.stopModule(WebContainer.java:1222)
        at com.ibm.ws.app.manager.module.internal.ModuleHandlerBase.undeployModule(ModuleHandlerBase.java:135)
        at com.ibm.ws.app.manager.module.internal.DeployedModuleInfoImpl.uninstallModule(DeployedModuleInfoImpl.java:66)
        at com.ibm.ws.app.manager.module.internal.SimpleDeployedAppInfoBase.uninstallApp(SimpleDeployedAppInfoBase.java:678)
        at com.ibm.ws.app.manager.wab.internal.WABInstaller$WABDeployedAppInfo.uninstallApp(WABInstaller.java:1585)
        at com.ibm.ws.app.manager.wab.internal.WABInstaller.uninstallFromWebContainer(WABInstaller.java:595)
        at com.ibm.ws.app.manager.wab.internal.WAB.removeFromWebContainer(WAB.java:463)
        at com.ibm.ws.app.manager.wab.internal.WAB.removeWAB(WAB.java:615)
        at com.ibm.ws.app.manager.wab.internal.WABGroup.uninstallGroup(WABGroup.java:70)
        at com.ibm.ws.app.manager.wab.internal.WABInstaller.deactivate(WABInstaller.java:336)
        at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
        at java.base/java.lang.reflect.Method.invoke(Method.java:580)
        at org.apache.felix.scr.impl.inject.methods.BaseMethod.invokeMethod(BaseMethod.java:245)
        at org.apache.felix.scr.impl.inject.methods.BaseMethod.access$500(BaseMethod.java:41)
        at org.apache.felix.scr.impl.inject.methods.BaseMethod$Resolved.invoke(BaseMethod.java:687)

It seems that something in the webcontainer security code is holding onto stale references when the CDI API bundles are refreshed. While that could be addressed, it may be better to figure out whether the CDI implementation can be installed dynamically without forcing the host CDI API bundle to be refreshed.

Azquelt commented 8 months ago

From our discussion:

The fragment in particular causes problems because a re-resolve is needed when a feature which depends on the CDI API feature is started first and the CDI feature is started later. However, the issues run a little deeper than this.

The CDI API has two places where an API class needs access to an instance from the implementation: CDI (which holds a reference to a CDIProvider instance) and BuildServicesResolver (which holds a BuildServices instance).

For BuildServicesResolver, the BuildServices instance is only loaded once. If the CDI feature is enabled, then disabled, then re-enabled, the CDI implementation bundles will be installed a second time, so we should be using a new instance of BuildServices. Continuing to use the old one is likely to eventually result in ClassCastExceptions when new and old classes mix, or if the old version of the class has stale references which no longer work.

Ideally, either the API needs to allow us to unset and re-set the instance when the CDI feature is stopped and started, or we need to set an instance which dynamically delegates to the real instance.
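
The second option (a stable instance that delegates dynamically) could look roughly like the sketch below. It is an illustration only and does not use the CDI API: Services is a hypothetical stand-in for an implementation-provided type such as BuildServices, and DelegatingHolder is a hypothetical helper built on java.lang.reflect.Proxy. Whether the API-side holder accepts such a delegate in practice is an assumption that would need checking.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in for a service supplied by the implementation bundles. */
interface Services {
    String describe();
}

/** Hypothetical helper: hands out one stable proxy that forwards to the current implementation. */
final class DelegatingHolder<T> {
    private final AtomicReference<T> current = new AtomicReference<>();
    private final T proxy;

    DelegatingHolder(Class<T> type) {
        InvocationHandler handler = (p, method, args) -> {
            // Look up the target on every call so a replaced implementation is picked up.
            T target = current.get();
            if (target == null) {
                throw new IllegalStateException("No implementation is currently registered");
            }
            try {
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow the implementation's own exception
            }
        };
        this.proxy = type.cast(
                Proxy.newProxyInstance(type.getClassLoader(), new Class<?>[] { type }, handler));
    }

    /** The only reference the API side needs to keep. */
    T proxy() { return proxy; }

    /** Called when the implementation feature starts. */
    void set(T impl) { current.set(impl); }

    /** Called when the implementation feature stops; clears only if impl is still current. */
    void unset(T impl) { current.compareAndSet(impl, null); }
}

class DelegateDemo {
    public static void main(String[] args) {
        DelegatingHolder<Services> holder = new DelegatingHolder<>(Services.class);
        Services api = holder.proxy();

        Services first = () -> "impl-1";
        holder.set(first);
        System.out.println(api.describe()); // forwards to the first implementation

        holder.unset(first);                // feature stopped
        Services second = () -> "impl-2";
        holder.set(second);                 // feature started again
        System.out.println(api.describe()); // forwards to the second implementation
    }
}
```

Because the API side only ever sees the proxy, nothing on that side retains a direct reference to an object from an uninstalled implementation bundle. The cost is a reflective call per method invocation and a failure at call time, rather than at lookup time, if no implementation is registered.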