eclipse-ee4j / glassfish

Eclipse GlassFish
https://eclipse-ee4j.github.io/glassfish/

[Blocking] Console does not come up on AIX systems when sun-web-app_2_3-0.dtd cannot be fetched. #16890

Closed glassfishrobot closed 13 years ago

glassfishrobot commented 13 years ago

Admin Console does not come up on some AIX systems. We are uncertain at this point what is causing the issue. It shows up with both 32- and 64-bit JDK installations, with secure admin enabled and disabled, and when GlassFish is installed as root or as a non-root user. The exceptions are always the same:

[#|2011-06-21T17:57:01.574-0700|INFO|glassfish3.1|com.sun.grizzly.config.GrizzlyServiceListener|_ThreadID=12;_ThreadName=Thread-8;|Listening to REST requests at context: /management/domain|#]

[#|2011-06-21T17:58:10.555-0700|SEVERE|glassfish3.1|javax.enterprise.system.container.web.com.sun.enterprise.glassfish.web|_ThreadID=11;_ThreadName=Thread-8;|java.net.ConnectException: A remote host did not respond within the timeout period.|#]

(...)

[#|2011-06-21T17:59:33.215-0700|SEVERE|glassfish3.1|javax.enterprise.resource.webcontainer.jsf.config|_ThreadID=11;_ThreadName=Thread-8;|Critical error during deployment:
com.sun.faces.config.ConfigurationException:
  Source Document: jndi:/__asadmin/WEB-INF/faces-config.xml
  Cause: Unable to find class 'com.sun.webui.jsf.faces.UIComponentELResolver'
    at com.sun.faces.config.processor.AbstractConfigProcessor.createInstance(AbstractConfigProcessor.java:273)
    at com.sun.faces.config.processor.ApplicationConfigProcessor.addELResolver(ApplicationConfigProcessor.java:574)
    at com.sun.faces.config.processor.ApplicationConfigProcessor.process(ApplicationConfigProcessor.java:301)
    at com.sun.faces.config.processor.AbstractConfigProcessor.invokeNext(AbstractConfigProcessor.java:114)
    at com.sun.faces.config.processor.LifecycleConfigProcessor.process(LifecycleConfigProcessor.java:116)
    at com.sun.faces.config.processor.AbstractConfigProcessor.invokeNext(AbstractConfigProcessor.java:114)
    at com.sun.faces.config.processor.FactoryConfigProcessor.process(FactoryConfigProcessor.java:222)
    at com.sun.faces.config.ConfigManager.initialize(ConfigManager.java:360)
    at com.sun.faces.config.ConfigureListener.contextInitialized(ConfigureListener.java:225)
    at org.apache.catalina.core.StandardContext.contextListenerStart(StandardContext.java:4750)
    at com.sun.enterprise.web.WebModule.contextListenerStart(WebModule.java:531)
    at org.apache.catalina.core.StandardContext.start(StandardContext.java:5366)
    at com.sun.enterprise.web.WebModule.start(WebModule.java:497)
    at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:917)
    at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:901)
    at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:733)
    at com.sun.enterprise.web.WebContainer.loadWebModule(WebContainer.java:1997)
    at com.sun.enterprise.web.WebContainer.loadWebModule(WebContainer.java:1648)
    at com.sun.enterprise.web.WebApplication.start(WebApplication.java:101)
    at org.glassfish.internal.data.EngineRef.start(EngineRef.java:130)
    at org.glassfish.internal.data.ModuleInfo.start(ModuleInfo.java:269)
    at org.glassfish.internal.data.ApplicationInfo.start(ApplicationInfo.java:294)
    at com.sun.enterprise.v3.server.ApplicationLifecycle.deploy(ApplicationLifecycle.java:462)
    at com.sun.enterprise.v3.server.ApplicationLoaderService.processApplication(ApplicationLoaderService.java:375)
    at com.sun.enterprise.v3.admin.adapter.InstallerThread.load(InstallerThread.java:210)
    at com.sun.enterprise.v3.admin.adapter.InstallerThread.run(InstallerThread.java:108)
Caused by: java.lang.ClassNotFoundException: com.sun.webui.jsf.faces.UIComponentELResolver
    at org.glassfish.web.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1519)
    at org.glassfish.web.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1369)
    at com.sun.faces.util.Util.loadClass(Util.java:281)
    at com.sun.faces.config.processor.AbstractConfigProcessor.loadClass(AbstractConfigProcessor.java:311)
    at com.sun.faces.config.processor.AbstractConfigProcessor.createInstance(AbstractConfigProcessor.java:240)
    ... 25 more
|#]

I'm logging this issue for tracking purposes.

Environment

AIX, IBM JDK 6, both 32- and 64-bit versions, GlassFish builds b04 - b08. First noticed on b08; now reproducible with earlier builds, even though it was not seen when those were originally tested.

Affected Versions

[3.1.1_dev]

glassfishrobot commented 13 years ago

@glassfishrobot Commented anilam said: You mean that on the same system, without any configuration change, using the same build, and logged in as the same user as before, you were able to bring up the console, but now the console fails to start? Something must have changed to cause the error. This is really puzzling.

glassfishrobot commented 13 years ago

@glassfishrobot Commented lidiam said: I've searched the contents of the jars under the glassfish/modules and glassfish/lib directories but cannot find a jar containing the com.sun.webui.jsf.faces.UIComponentELResolver class. Where is it expected to reside?

glassfishrobot commented 13 years ago

@glassfishrobot Commented lidiam said: It seems that once someone gets these exceptions, they are always displayed, even when moving back to a GlassFish version that previously worked on the same system. I checked with IT and no changes have been made to the AIX systems. I also compared PATH between a machine where Admin Console still works and one where it does not. I made changes to make the Java settings identical, but that did not resolve the issue:

System where things still work:

bash-3.2# echo $PATH
/usr/bin:/etc:/usr/sbin:/usr/ucb:/usr/bin/X11:/sbin:/usr/java5/jre/bin:/usr/java5/bin

bash-3.2# ls -l /usr/bin/java
lrwxrwxrwx 1 root system 19 May 19 10:11 /usr/bin/java -> /usr/java6/bin/java

bash-3.2# which java
/usr/bin/java

bash-3.2# java -version
java version "1.6.0"
Java(TM) SE Runtime Environment (build pap3260sr9fp1-20110208_03(SR9 FP1))
IBM J9 VM (build 2.4, JRE 1.6.0 IBM J9 2.4 AIX ppc-32 jvmap3260sr9-20110203_74623 (JIT enabled, AOT enabled)
J9VM - 20110203_074623
JIT - r9_20101028_17488ifx3
GC - 20101027_AA)
JCL - 20110203_01

System where Admin Console does not come up:

bash-3.00# echo $PATH
/usr/bin:/etc:/usr/sbin:/usr/ucb:/usr/bin/X11:/sbin:/export/sqe/lidia/glassfish3/bin:/export/hudson/tools/ant-1.7.1/bin:.

bash-3.00# ls -l /usr/bin/java
lrwxrwxrwx 1 root system 19 Jun 22 13:28 /usr/bin/java -> /usr/java6/bin/java

bash-3.00# which java
/usr/bin/java

bash-3.00# java -version
java version "1.6.0"
Java(TM) SE Runtime Environment (build pap3260sr9fp1-20110208_03(SR9 FP1))
IBM J9 VM (build 2.4, JRE 1.6.0 IBM J9 2.4 AIX ppc-32 jvmap3260sr9-20110203_74623 (JIT enabled, AOT enabled)
J9VM - 20110203_074623
JIT - r9_20101028_17488ifx3
GC - 20101027_AA)
JCL - 20110203_01

glassfishrobot commented 13 years ago

@glassfishrobot Commented lidiam said: Tried Oracle-branded and open source versions of GlassFish, and also tried rebooting the machine. The console fails in all cases once the initial exception has been encountered.

glassfishrobot commented 13 years ago

@glassfishrobot Commented sirajg said: Can you try deploying other web applications on systems that are failing to load admin console?

glassfishrobot commented 13 years ago

@glassfishrobot Commented dhirup said: BTW the class exist in my install location in the following jar: ~/work/appserv/v3/3.1.1/aix-install/glassfish3/glassfish/lib/install/applications/__admingui/WEB-INF/lib/webui-jsf-4.0.2.7.jar
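A search like the one discussed above can be scripted so that nested directories such as WEB-INF/lib are not missed. The following is only a minimal sketch: the class name FindClassInJars and its method are hypothetical helpers, not part of GlassFish, and it assumes the class sits directly in a jar rather than in a jar nested inside another archive.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;
import java.util.stream.Stream;

// Hypothetical helper (not GlassFish code): walks a directory tree and
// reports every jar that contains the given class.
final class FindClassInJars {

    static List<Path> jarsContaining(Path root, String className) throws IOException {
        // com.example.Foo is stored in a jar as com/example/Foo.class
        String entryName = className.replace('.', '/') + ".class";
        List<Path> hits = new ArrayList<>();
        try (Stream<Path> paths = Files.walk(root)) {
            for (Path p : (Iterable<Path>) paths::iterator) {
                if (!Files.isRegularFile(p) || !p.toString().endsWith(".jar")) {
                    continue;
                }
                try (JarFile jar = new JarFile(p.toFile())) {
                    if (jar.getEntry(entryName) != null) {
                        hits.add(p);
                    }
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: directory to search, args[1]: fully qualified class name
        for (Path jar : jarsContaining(Paths.get(args[0]), args[1])) {
            System.out.println(jar);
        }
    }
}
```

Pointing it at the glassfish directory with com.sun.webui.jsf.faces.UIComponentELResolver as the second argument would be one way to repeat the search recursively.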

glassfishrobot commented 13 years ago

@glassfishrobot Commented lidiam said: And even when the webui jar file was moved to glassfish3/glassfish/lib, the console still did not come up, with the same error.

Btw, deploying hello.war works fine (comes up no problem).

glassfishrobot commented 13 years ago

@glassfishrobot Commented lidiam said: Raising to Blocker status per Sathyan's request.

glassfishrobot commented 13 years ago

@glassfishrobot Commented anilam said: 1. Jane pointed me to http://gf-hudson.us.oracle.com/hudson/job/gf-3.1.1-aix-build/ It shows that QL was passing on Friday (build# 281), then Sahoo integrated:

16764: [osgi-http] request.getSession().getServletContext() should return the same OSGiServletContext that's used during setAttrbute().

16880: [osgi-http] NPE in Activator.stop() if start() has failed.

The above bugs are fixed by migrating to osgi-http version 1.0.4 from 1.0.2.

and then QL starts failing on build# 282, with the exact same stack trace we are all seeing. Of course, this doesn't explain why build 04 which used to be working failed on the same machine without any configuration change.

2. Sahoo suggested adding the jvm option, "-verbose:class" in domain.xml to see if the class is really loaded and from where.

3. If we think this is related to JDK, we should try installing the IBM JDK on a Linux machine, that may isolate if it is due to the JDK or AIX or other factor.
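As a complement to the -verbose:class suggestion in item 2, a class that does load can be asked where it came from. This is only a sketch: WhereLoadedFrom is a hypothetical helper, and it cannot help when the class fails to load at all, which is the case in this report.

```java
import java.net.URL;
import java.security.CodeSource;
import java.util.Optional;

// Hypothetical helper (not GlassFish code): reports the jar or directory
// a loaded class came from.
final class WhereLoadedFrom {

    static Optional<URL> locationOf(Class<?> type) {
        // Classes loaded by the bootstrap loader have no CodeSource.
        CodeSource source = type.getProtectionDomain().getCodeSource();
        return source == null ? Optional.empty() : Optional.ofNullable(source.getLocation());
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // args[0]: fully qualified class name visible to this class loader
        Class<?> type = Class.forName(args[0]);
        System.out.println(locationOf(type).map(URL::toString).orElse("(bootstrap or unknown)"));
    }
}
```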

Jane is also working with Sahoo now on this issue.

glassfishrobot commented 13 years ago

@glassfishrobot Commented ss141213 said: The offending code is in WarHandler.java:

public ClassLoader getClassLoader(ClassLoader parent, DeploymentContext context) {
    WebappClassLoader cloader = new WebappClassLoader(parent);
    try {
        FileDirContext r = new FileDirContext();
        File base = new File(context.getSource().getURI());
        r.setDocBase(base.getAbsolutePath());

        cloader.setResources(r);
        cloader.addRepository("WEB-INF/classes/", new File(base, "WEB-INF/classes/"));
        if (context.getScratchDir("ejb") != null) {
            cloader.addRepository(context.getScratchDir("ejb").toURI().toURL().toString().concat("/"));
        }
        if (context.getScratchDir("jsp") != null) {
            cloader.setWorkDir(context.getScratchDir("jsp"));
        }

        // add libraries referenced from manifest
        for (URL url : getManifestLibraries(context)) {
            cloader.addRepository(url.toString());
        }

        WebXmlParser webXmlParser = null;
        if ((new File(base, GLASSFISH_WEB_XML)).exists()) {
            webXmlParser = new GlassFishWebXmlParser(base.getAbsolutePath());
        } else if ((new File(base, SUN_WEB_XML)).exists()) {
            webXmlParser = new SunWebXmlParser(base.getAbsolutePath());
        } else if ((new File(base, WEBLOGIC_XML)).exists()) {
            webXmlParser = new WeblogicXmlParser(base.getAbsolutePath());
        } else {
            webXmlParser = new GlassFishWebXmlParser(base.getAbsolutePath());
        }

        configureLoaderAttributes(cloader, webXmlParser, base);
        configureLoaderProperties(cloader, webXmlParser, base);

    } catch(MalformedURLException malex) {
        logger.log(Level.SEVERE, malex.getMessage());
        if (logger.isLoggable(Level.FINE)) {
            logger.log(Level.FINE, malex.getMessage(), malex);
        }
    } catch(XMLStreamException xse) {
        logger.log(Level.SEVERE, xse.getMessage());
        if (logger.isLoggable(Level.FINE)) {
            logger.log(Level.FINE, xse.getMessage(), xse);
        }
    } catch(FileNotFoundException fnfe) {
        logger.log(Level.SEVERE, fnfe.getMessage());
        if (logger.isLoggable(Level.FINE)) {
            logger.log(Level.FINE, fnfe.getMessage(), fnfe);
        }
    }

    cloader.start();

    return cloader;
}

If it gets an exception while creating SunWebXmlParser, it does not add WEB-INF/lib/*.jar to the search path of the web app class loader. For some reason, it goes to the Internet to fetch the DTD while creating the SunWebXmlParser and, because of the network configuration, it cannot reach the Internet. So it fails with a ConnectException. Three things to do:

a) Set up the networking layer to use a proxy. This will make the test pass.
b) Find out why we are going to the Internet to fetch the DTD instead of using the local DTD store. If this is a recent change, that's the culprit.
c) Fix WarHandler to fail instead of continuing if SunWebXmlParser can't be created. We don't have time to debug such useless class loading issues.

Thanks to Jane for help in debugging the issue.
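The failure mode described in this comment, and the fail-fast alternative proposed in item c), can be illustrated with a toy model. This is not the WarHandler code: SwallowedFailureDemo, DescriptorParser, and the string entries standing in for class loader repositories are all hypothetical and exist only to show why a swallowed parser exception surfaces much later as a ClassNotFoundException.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model (hypothetical names): a swallowed parser failure leaves the
// class loader search path incomplete, while a strict variant fails early.
final class SwallowedFailureDemo {

    /** Stand-in for the descriptor parser; throws when the DTD cannot be fetched. */
    interface DescriptorParser {
        void parse() throws Exception;
    }

    /** Log-and-continue: setup that follows the parser is silently skipped. */
    static List<String> buildSearchPathLenient(DescriptorParser parser) {
        List<String> searchPath = new ArrayList<>();
        searchPath.add("WEB-INF/classes/");
        try {
            parser.parse();
            searchPath.add("WEB-INF/lib/*.jar"); // only reached when parsing succeeds
        } catch (Exception e) {
            System.err.println("SEVERE: " + e.getMessage());
        }
        return searchPath; // may lack WEB-INF/lib, yet the caller carries on
    }

    /** Fail-fast: the original cause is reported at deployment time. */
    static List<String> buildSearchPathStrict(DescriptorParser parser) {
        List<String> searchPath = new ArrayList<>();
        searchPath.add("WEB-INF/classes/");
        try {
            parser.parse();
        } catch (Exception e) {
            throw new IllegalStateException("Cannot parse descriptor; aborting deployment", e);
        }
        searchPath.add("WEB-INF/lib/*.jar");
        return searchPath;
    }
}
```

With the lenient variant the application still starts and only later fails to find a class from WEB-INF/lib; with the strict variant the ConnectException would be the first and only error reported.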

glassfishrobot commented 13 years ago

@glassfishrobot Commented oleksiys said: The root problem is that during parsing glassfish3/glassfish/lib/install/applications/__admingui/WEB-INF/sun-web.xml

the xml parser is trying to download dtd from http://www.sun.com/software/sunone/appserver/dtds/sun-web-app_2_3-0.dtd

and this fails (time out expires), because proxy is not set.

glassfishrobot commented 13 years ago

@glassfishrobot Commented tmueller said: More on fetching the DTD. Recently, the www.sun.com URL for the DTD was redirected to a www.oracle.com URL:

http://www.oracle.com/webfolder/technetwork/sun/software/dtd/appserver/sun-web-app_2_3-0.dtd

However, the lab systems within OWAN cannot connect to www.oracle.com without a proxy, and not only that, when they do try to connect, it takes a long time (we've seen 7.5 minutes on glassfish-x86-1) for the connection to fail, and then it fails with a "no route to host" message.

glassfishrobot commented 13 years ago

@glassfishrobot Commented ap2257 said: Adding "blocking" to the bug synopsis for easy identification and upcoming bug swat meetings.

glassfishrobot commented 13 years ago

@glassfishrobot Commented anilam said: Thanks everyone for helping to solve this issue. This doesn't sound like a GUI-specific problem, since any web application may specify this DTD and may end up failing mysteriously without the user knowing what the cause may be. I am transferring this to web-container to see if it is possible to log the error so it's easier for the user to debug the problem. I am also downgrading this from 'blocker'.

Although the redirect should take care of the problem, we will also update the dtd URL to the oracle.com site so we don't rely on the redirect.

glassfishrobot commented 13 years ago

@glassfishrobot Commented anilam said: I think Alex and I tried to update the issue at the same time, and mine removed his changes. I will change the synopsis and put back the blocker priority since that's what QA thinks it should be.

glassfishrobot commented 13 years ago

@glassfishrobot Commented tmueller said: Anissa, The redirect is not the problem. Lab machines are still able to get to www.sun.com just fine without a proxy. It is the access to www.oracle.com after the redirect that is the problem.

glassfishrobot commented 13 years ago

@glassfishrobot Commented ap2257 said: I just tried by setting the proxy as a general environment setting on my setup as follows:

Proxy Setup

http_proxy=http://www-proxy.us.oracle.com:80/
export http_proxy

but I still see the same problem and I'm not able to bring up the Admin Console. I believe it should come up with my setting if the proxy is the problem. I'm not clear on what the workaround for this problem is. I think we still have a problem.

glassfishrobot commented 13 years ago

@glassfishrobot Commented sirajg said: I tried the workaround by specifying these in domain.xml:

-Dhttp.proxyHost=www-proxy.us.oracle.com -Dhttp.proxyPort=80

in and the console comes up properly.

glassfishrobot commented 13 years ago

@glassfishrobot Commented ss141213 said: Few observations:

Alex,

Use -Dhttp.proxyHost=www-proxy.us.oracle.com in domain.xml. See [1].

Anissa/Tom,

File a bug against deployment/web container, whoever is responsible for creating *WebXmlParser, to not fetch the dtd from Internet. They should be able to match the dtd from glassfish/lib/dtds. We must NOT assume there is Internet connection available to use admin console or any web app for that matter.

Sahoo

[1] http://download.oracle.com/javase/6/docs/technotes/guides/net/proxies.html
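The difference between the http_proxy environment variable and the -Dhttp.proxyHost option comes down to what the JVM's default proxy selection consults. To my knowledge it reads the http.proxyHost and http.proxyPort system properties described in [1] and does not look at http_proxy. The sketch below sets those properties programmatically, which is equivalent to passing them as -D options; ProxyPropertiesExample is a hypothetical class name and the host is the one quoted in this thread.

```java
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.URI;
import java.util.List;

// Hypothetical demo class: shows which proxies the JVM would use for a URL
// before and after the standard http.proxy* system properties are set.
final class ProxyPropertiesExample {

    /** Equivalent to -Dhttp.proxyHost=... -Dhttp.proxyPort=... on the command line. */
    static void configureHttpProxy(String host, int port) {
        System.setProperty("http.proxyHost", host);
        System.setProperty("http.proxyPort", Integer.toString(port));
    }

    /** Asks the default ProxySelector which proxies apply to the given URL. */
    static List<Proxy> proxiesFor(String url) {
        return ProxySelector.getDefault().select(URI.create(url));
    }

    public static void main(String[] args) {
        String dtdHost = "http://www.oracle.com/";
        System.out.println("before: " + proxiesFor(dtdHost));
        configureHttpProxy("www-proxy.us.oracle.com", 80);
        System.out.println("after:  " + proxiesFor(dtdHost));
    }
}
```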

glassfishrobot commented 13 years ago

@glassfishrobot Commented anilam said: This bug is already under web container, assigned to oleksiys, as I thought that if the DTD cannot be fetched, an error should be logged so the user is informed what the problem may be. But Sahoo's suggestion about fixing the code so that it doesn't fetch the DTD from the Internet is even better. GlassFish shouldn't require network access in order for any web application to work.

So, I will just leave this bug as it is for web container to fix it by not depending on the network, instead of opening up another bug.

thanks.

glassfishrobot commented 13 years ago

@glassfishrobot Commented ap2257 said: Just to make sure I understand. Siraj and Sahoo's suggestion is only a workaround for this problem. The real solution will happen when the Web Container is fixed. Correct? Can someone confirm?

glassfishrobot commented 13 years ago

@glassfishrobot Commented anilam said: That's my expectation too. After WarHandler is modified to fetch the DTD locally, instead of going through the network to get it from oracle.com, this bug will then be marked resolved.

glassfishrobot commented 13 years ago

@glassfishrobot Commented @shingwaichan said: I think other DTDs will have the same issue, too. I think the deployment side should try to look at local first before going to internet.

glassfishrobot commented 13 years ago

@glassfishrobot Commented ss141213 said: There are two bugs here, so there should be two bugs filed in the system to track them:

a) web container should not allow the app to be deployed when WarHandler failed to parse sun-web.xml.
b) web container and/or deployment backend must not rely on Internet connectivity.

a) and b) are independent of each other.

glassfishrobot commented 13 years ago

@glassfishrobot Commented oleksiys said: Sorry guys, I'm not looking into this issue. Reassigning to Shing Wai (as web container owner), so he can dispatch it properly.

glassfishrobot commented 13 years ago

@glassfishrobot Commented @tjquinno said: Shing Wai and Sahoo,

The deployment code which parses descriptors already reads DTDs and schemas from the local copies if local copies exist. (See deployment/dol SaxParserHandler.java).

The only time this will not happen is if the XML document refers to a DTD or schema for which GlassFish does not have a local copy. Then it will try to find the DTD or schema over the network. (In the past the usual cause for this we have seen is if the descriptor contains a misspelling of the system ID or public ID or the schema path.)
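The local-first behavior described here can be sketched with a plain SAX EntityResolver. This is not the SaxParserHandler code: LocalFirstEntityResolver is a hypothetical class, and matching a DTD by the last path segment of its system ID against files in one directory is an assumption made for the example. Returning null tells the parser to fall back to its default handling, which is to open the system ID itself, possibly over the network.

```java
import java.io.IOException;
import java.io.StringReader;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical resolver (not GlassFish code): serves DTDs from a local
// directory when a copy exists, otherwise defers to the parser's default.
final class LocalFirstEntityResolver implements EntityResolver {
    private final Path dtdDir;
    int localHits = 0; // how many lookups were served from the local copy

    LocalFirstEntityResolver(Path dtdDir) {
        this.dtdDir = dtdDir;
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) throws IOException {
        if (systemId == null) {
            return null;
        }
        // Assumption: local copies are named after the last segment of the system ID.
        Path local = dtdDir.resolve(systemId.substring(systemId.lastIndexOf('/') + 1));
        if (!Files.isRegularFile(local)) {
            return null; // no local copy: the parser will try the system ID itself
        }
        localHits++;
        InputSource source = new InputSource(Files.newInputStream(local));
        source.setPublicId(publicId);
        source.setSystemId(systemId);
        return source;
    }

    /** Parses the document with the given resolver installed. */
    static void parse(String xml, EntityResolver resolver) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setEntityResolver(resolver);
        reader.setContentHandler(new DefaultHandler());
        reader.parse(new InputSource(new StringReader(xml)));
    }
}
```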

It looks as if the parsers in WarHandler might need to invoke setXMLResolver (with a suitable resolver implementation) on the factory before creating the XML reader.
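A minimal sketch of the setXMLResolver idea, using the standard StAX API rather than the actual WarHandler parsers: LocalDtdStaxExample, the directory layout, and the file-name matching are assumptions made for illustration. Unlike a local-first resolver, this variant throws when no local copy exists, so a missing DTD fails immediately instead of triggering a network fetch.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.StringReader;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLResolver;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical example (not GlassFish code): installs an XMLResolver on the
// factory before the reader is created, so DTDs come from a local directory.
final class LocalDtdStaxExample {

    /** Resolves a DTD by file name from dtdDir and never touches the network. */
    static XMLResolver localDtdResolver(Path dtdDir) {
        return (publicId, systemId, baseUri, namespace) -> {
            if (systemId == null) {
                throw new XMLStreamException("No system ID to resolve");
            }
            // Assumption: local copies are named after the last segment of the system ID.
            Path local = dtdDir.resolve(systemId.substring(systemId.lastIndexOf('/') + 1));
            if (!Files.isRegularFile(local)) {
                throw new XMLStreamException("No local copy of " + systemId + " in " + dtdDir);
            }
            try {
                return new ByteArrayInputStream(Files.readAllBytes(local));
            } catch (IOException e) {
                throw new XMLStreamException("Cannot read " + local, e);
            }
        };
    }

    /** Parses the document and returns the local name of its root element. */
    static String rootElementName(String xml, Path dtdDir) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setXMLResolver(localDtdResolver(dtdDir)); // must happen before the reader is created
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    return reader.getLocalName();
                }
            }
            throw new XMLStreamException("No root element found");
        } finally {
            reader.close();
        }
    }
}
```

If the descriptors never depend on entities declared in the DTD, setting XMLInputFactory.SUPPORT_DTD to false on the factory would be another option to consider, at the cost of ignoring the DTD entirely.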

If someone can document a case where deployment's parsing is failing to find a local copy that exists please open a separate issue.

glassfishrobot commented 13 years ago

@glassfishrobot Commented ss141213 said: Tim,

That's what I have been saying. *WebXmlParser in WarHandler does not set the resolver to first search local store for dtds. You can see the code for these classes in WarHandler.java. That's why I asked Anissa & team to file a separate bug for fixing the dtd resolver issue.

Sahoo

glassfishrobot commented 13 years ago

@glassfishrobot Commented @tjquinno said: Sahoo,

You also said "web container and/or deployment backend must not rely on Internet connectivity." Plus Shing-wai earlier said "the deployment side should try to look at local first."

Deployment does not seem to be implicated in this issue, plus its XML parsing does not access the network - unless the requested DTD or schema is not in the local set. I wanted to eliminate any potential confusion your and Shing Wai's comments might have caused so no one was expecting a fix from deployment for this issue.

glassfishrobot commented 13 years ago

@glassfishrobot Commented ss141213 said: Tim,

Did you look at SunWebXmlParser in WarHandler.java? None of those Sun/GlassFish/WebLogicWebXmlParsers instruct the XmlParser to search the local DTD store. Now, deployment or web container team has to figure out who owns that piece of code.

Sahoo

glassfishrobot commented 13 years ago

@glassfishrobot Commented @tjquinno said: Sahoo,

I had already looked at the code, as indicated by my earlier comment.

Given that WarHandler is in the web/war-util module and looking at the svn check-in annotations for those lines in the class, the web container team clearly should own this issue. Your and Shing Wai's mention of deployment added confusion I wanted to clear up.