Closed glassfishrobot closed 12 years ago
@glassfishrobot Commented sherryshen said: [1] start cluster with password on b45. http://bigapp-oblade-10.us.oracle.com:1080/job/sherry-core-lc/17/artifact/cli.log http://bigapp-oblade-10.us.oracle.com:1080/job/sherry-core-lc/17/console
start-cluster-common:
[exec] asadmin --host localhost --port 4848 --user admin --passwordfile /root/.hudson/jobs/sherry-core-lc/workspace/appserver-sqe/build-config/adminpasswordfile.txt --interactive=false --echo=true --terse=false start-cluster --verbose=false sqe-cluster
[exec] Command start-cluster failed.
[exec] remote failure: clustered_instance_1: Could not start instance clustered_instance_1 on node localhost-sqe-domain (localhost).
[exec]
[exec] Command failed on node localhost-sqe-domain (localhost): Command start-local-instance failed.
[exec]
[exec] CLI802 Synchronization failed for directory config, caused by:
[exec] Authentication failed for user: null
[exec] (Usually, this means invalid user name and/or password)
[2] start cluster with password on b38. http://bigapp-oblade-10.us.oracle.com:1080/job/sherry-core-lc/15/artifact/cli.log http://bigapp-oblade-10.us.oracle.com:1080/job/sherry-core-lc/15/console
@glassfishrobot Commented @tjquinno said: I'll take ownership of this, at least for the moment.
@glassfishrobot Commented @tjquinno said: Reassigning back to Tom.
Sherry, please read through this (long) entry because I have a question for you at the bottom that might help us understand this.
I cannot reproduce this specific problem, but it might be related to something else I found.
When I run these commands things work fine for me:
I created adminpassword.txt as Sherry described. I put it in my current default directory which is outside of the installed GlassFish directory tree.
Note that the next batch of commands all specify --passwordfile adminpassword.txt on each command.
asadmin --passwordfile adminpassword.txt create-domain sqe-domain
Enter admin user name [Enter to accept default "admin" / no password]> admin
asadmin start-domain sqe-domain
asadmin uptime # correctly prompts for credentials
asadmin --passwordfile adminpassword.txt uptime # works correctly with no prompting
asadmin --passwordfile adminpassword.txt create-cluster c1
asadmin --passwordfile adminpassword.txt create-local-instance --cluster c1 i1
asadmin --passwordfile adminpassword.txt start-instance i1
asadmin --passwordfile adminpassword.txt list-instances
asadmin --passwordfile adminpassword.txt stop-instance i1
asadmin --passwordfile adminpassword.txt start-cluster c1
asadmin --passwordfile adminpassword.txt list-instances i1
asadmin --passwordfile adminpassword.txt list-clusters
asadmin --passwordfile adminpassword.txt stop-cluster c1
asadmin --passwordfile adminpassword.txt delete-domain sqe-domain
Next, I ran
export AS_ADMIN_PASSWORDFILE=adminpassword.txt # Note, no directory spec
and then ran these commands:
asadmin create-domain sqe-domain
Enter admin user name [Enter to accept default "admin" / no password]> admin
Using default port 4848 for Admin.
Using default port 8080 for HTTP Instance.
Using default port 7676 for JMS.
Using default port 3700 for IIOP.
Using default port 8181 for HTTP_SSL.
Using default port 3820 for IIOP_SSL.
Using default port 3920 for IIOP_MUTUALAUTH.
Using default port 8686 for JMX_ADMIN.
Using default port 6666 for OSGI_SHELL.
Using default port 9009 for JAVA_DEBUGGER.
Distinguished Name of the self-signed X.509 Server Certificate is:
[CN=dhcp-whq-twvpn-1-vpnpool-10-159-222-141.vpn.oracle.com,OU=GlassFish,O=Oracle Corporation,L=Santa Clara,ST=California,C=US]
Distinguished Name of the self-signed X.509 Server Certificate is:
[CN=dhcp-whq-twvpn-1-vpnpool-10-159-222-141.vpn.oracle.com-instance,OU=GlassFish,O=Oracle Corporation,L=Santa Clara,ST=California,C=US]
Domain.xml customization failed : com.sun.enterprise.module.bootstrap.BootException: Cannot find main module DomainCreation : no such module
Domain sqe-domain created.
Domain sqe-domain admin port is 4848.
Domain sqe-domain admin user is "admin".
Command create-domain executed successfully.

asadmin start-domain sqe-domain
Waiting for sqe-domain to start ........
Successfully started the domain : sqe-domain
domain Location: /Users/tjquinn/asgroup/v3/J/publish/glassfish3/glassfish/domains/sqe-domain
Log File: /Users/tjquinn/asgroup/v3/J/publish/glassfish3/glassfish/domains/sqe-domain/logs/server.log
Admin Port: 4848
Command start-domain executed successfully.
asadmin uptime
Up 8 secs
Command uptime executed successfully.

asadmin create-cluster c1
Command create-cluster executed successfully.

asadmin create-local-instance --cluster c1 i1
Rendezvoused with DAS on localhost:4848.
Port Assignments for server instance i1:
JMX_SYSTEM_CONNECTOR_PORT=28686
JMS_PROVIDER_PORT=27676
HTTP_LISTENER_PORT=28080
ASADMIN_LISTENER_PORT=24848
JAVA_DEBUGGER_PORT=29009
IIOP_SSL_LISTENER_PORT=23820
IIOP_LISTENER_PORT=23700
OSGI_SHELL_TELNET_PORT=26666
HTTP_SSL_LISTENER_PORT=28181
IIOP_SSL_MUTUALAUTH_PORT=23920
Command create-local-instance executed successfully.
asadmin start-instance i1
remote failure: Could not start instance i1 on node localhost-sqe-domain (localhost).
Command failed on node localhost-sqe-domain (localhost): Command start-local-instance failed.
java.io.FileNotFoundException: /Users/tjquinn/asgroup/v3/J/publish/glassfish3/glassfish/domains/sqe-domain/config/adminpassword.txt (No such file or directory)
To complete this operation run the following command locally on host localhost from the GlassFish install location /Users/tjquinn/asgroup/v3/J/publish/glassfish3:
lib/nadmin start-local-instance --node localhost-sqe-domain --sync normal i1
Command start-instance failed.
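With the environment variable set to a relative path, start-instance fails because, in my case, adminpassword.txt does not exist in the current directory of the spawned shell that tries to start the instance. As the FileNotFoundException shows, the file is looked up under the domain's config directory rather than in the directory where I ran the commands.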
Sherry, I wonder if maybe you have AS_ADMIN_PASSWORDFILE set in a way that refers to a valid file from the directory where you run the commands but points to an existing but invalid file when run from a spawned shell when the DAS is trying to start the instance?
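The relative-path effect described above can be seen without GlassFish at all. The following is a minimal sketch, assuming a POSIX shell with mktemp; the two temporary directories and the variable names from_workdir, from_otherdir and from_otherdir_abs are hypothetical and stand in for the directory where the commands are typed and the directory of the spawned process. It only checks whether the file named by AS_ADMIN_PASSWORDFILE is visible, it does not call asadmin.

```shell
# Hypothetical directories: one where the commands are typed, one standing in
# for the working directory of the spawned process.
workdir=$(mktemp -d)
otherdir=$(mktemp -d)
printf 'AS_ADMIN_PASSWORD=adminadmin\n' > "$workdir/adminpassword.txt"

cd "$workdir"
export AS_ADMIN_PASSWORDFILE=adminpassword.txt   # relative path, no directory spec

# The file is visible from the directory where the commands are typed...
if [ -f "$AS_ADMIN_PASSWORDFILE" ]; then from_workdir=found; else from_workdir=missing; fi

# ...but not from a child process running in a different directory.
from_otherdir=$(cd "$otherdir" && if [ -f "$AS_ADMIN_PASSWORDFILE" ]; then echo found; else echo missing; fi)

# An absolute path does not depend on the working directory.
export AS_ADMIN_PASSWORDFILE="$workdir/adminpassword.txt"
from_otherdir_abs=$(cd "$otherdir" && if [ -f "$AS_ADMIN_PASSWORDFILE" ]; then echo found; else echo missing; fi)

echo "$from_workdir $from_otherdir $from_otherdir_abs"   # prints "found missing found"
```

If the variable is used at all, giving it an absolute path avoids this particular lookup failure; whether that also explains the b45 result is a separate question.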
@glassfishrobot Commented sherryshen said: Thanks, Tim, for the analysis.
1) AS_ADMIN_PASSWORDFILE is not used in the SQE tests for b45 or b38. Using the same password file located in the test workspace, the start-cluster failure seen on b45 does not occur on b38. On b38:
[exec] asadmin --host localhost --port 4848 --user admin --passwordfile /root/.hudson/jobs/sherry-core-lc/workspace/appserver-sqe/build-config/adminpasswordfile.txt --interactive=false --echo=true --terse=false start-cluster --verbose=false sqe-cluster
[exec] Command start-cluster executed successfully.
2) If I set AS_ADMIN_PASSWORDFILE in the environment, e.g.
$ export AS_ADMIN_PASSWORDFILE=/root/.hudson/jobs/sherry-core-lc/workspace/appserver-sqe/build-config/adminpasswordfile.txt
the start-cluster failure on b45 is resolved.
If I do not set AS_ADMIN_PASSWORDFILE, or set it to an empty value with
$ export AS_ADMIN_PASSWORDFILE=
the start-cluster failure appears on b45.
@glassfishrobot Commented sherryshen said: After discussing with Tom, I ran CLI tests on glassfish-4.0-b46.zip.
$ cat ./adminpasswordfile.txt
AS_ADMIN_PASSWORD=adminadmin
AS_ADMIN_MASTERPASSWORD=changeit
AS_ADMIN_USERPASSWORD=secret
$ asadmin --user admin --passwordfile ./adminpasswordfile.txt create-domain sqe-domain
$ asadmin --user admin --passwordfile ./adminpasswordfile.txt start-domain sqe-domain
$ asadmin --user admin --passwordfile ./adminpasswordfile.txt version
$ asadmin --user admin --passwordfile ./adminpasswordfile.txt create-local-instance --cluster sqe-cluster clustered_instance_1
$ asadmin --user admin --passwordfile ./adminpasswordfile.txt start-cluster sqe-cluster
A. With export AS_ADMIN_PASSWORDFILE= (empty value), or with AS_ADMIN_PASSWORDFILE not set at all, start-cluster fails:
remote failure: clustered_instance_1: Could not start instance clustered_instance_1 on node localhost-sqe-domain (localhost).
Command failed on node localhost-sqe-domain (localhost): Command start-local-instance failed.
CLI802 Synchronization failed for directory config, caused by:
Authentication failed for user: null
(Usually, this means invalid user name and/or password)
To complete this operation run the following command locally on host localhost from the GlassFish install location /root/.hudson/jobs/sherry-core-lc/workspace/glassfish3:
lib/nadmin start-local-instance --node localhost-sqe-domain --sync normal clustered_instance_1
The command start-instance failed for: clustered_instance_1
Command start-cluster failed.
B. With export AS_ADMIN_PASSWORDFILE=$SPS_HOME/build-config/adminpasswordfile.txt, start-cluster works fine.
In both A and B, I noticed that sqe-domain is created with the error "Domain.xml customization failed":
$ asadmin --user admin --passwordfile ./adminpasswordfile.txt create-domain sqe-domain
Using default port 4848 for Admin
Using default port 8080 for HTTP Instance.
Using default port 7676 for JMS.
Using default port 3700 for IIOP.
Using default port 8181 for HTTP_SSL.
Using default port 3820 for IIOP_SSL.
Using default port 3920 for IIOP_MUTUALAUTH.
Using default port 8686 for JMX_ADMIN.
Using default port 6666 for OSGI_SHELL.
Using default port 9009 for JAVA_DEBUGGER.
Distinguished Name of the self-signed X.509 Server Certificate is:
[CN=bigapp-oblade-10,OU=GlassFish,O=Oracle Corporation,L=Santa Clara,ST=California,C=US]
Distinguished Name of the self-signed X.509 Server Certificate is:
[CN=bigapp-oblade-10-instance,OU=GlassFish,O=Oracle Corporation,L=Santa Clara,ST=California,C=US]
Domain.xml customization failed : com.sun.enterprise.module.bootstrap.BootException: Cannot find main module DomainCreation : no such module
Domain sqe-domain created.
Domain sqe-domain admin port is 4848.
Domain sqe-domain admin user is "admin".
Command create-domain executed successfully.
@glassfishrobot Commented tmueller said: I've confirmed the test case. Thanks for providing these details.
Here are some more observations.
This appears to be a problem only when starting the instance for the first time, and then only with start-cluster.
If the instance is started with start-instance (using the --passwordfile option) or start-local-instance, the instance starts. The instance can then be stopped, and start-cluster will work fine using the --passwordfile option. So this problem is related to how start-cluster is running start-local-instance the first time that an instance is started. Somehow setting the AS_ADMIN_PASSWORDFILE environment variable avoids the problem.
Note: the Domain.xml customization failed problem is unrelated to this issue. It is probably related to the recent HK2 changes.
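The observation that a first start through start-instance lets a later start-cluster succeed can be written down as a short sequence. This is only a sketch of that observed behaviour on the affected builds, not a recommended procedure: the run helper, the RUN switch and the PWFILE variable are hypothetical names introduced here, and the block prints the commands instead of executing them unless RUN=1 is set. The asadmin subcommands and options are the ones already used in this thread.

```shell
# Hypothetical helper: prints each command by default (dry run);
# set RUN=1 to actually invoke it against a running DAS.
PWFILE=${PWFILE:-$PWD/adminpasswordfile.txt}   # assumed location of the password file
run() {
  if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

# First start of the instance goes through start-instance, which was seen to work.
run asadmin --user admin --passwordfile "$PWFILE" start-instance clustered_instance_1
run asadmin --user admin --passwordfile "$PWFILE" stop-instance clustered_instance_1

# After one successful start, start-cluster was observed to work with --passwordfile.
run asadmin --user admin --passwordfile "$PWFILE" start-cluster sqe-cluster
```

The dry-run default is a deliberate choice so the sequence can be read and checked on a machine without a GlassFish installation.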
@glassfishrobot Commented tmueller said: Fixed on the trunk in revision 55228.
The root cause of the problem is that ClusterCommandHelper, which start-cluster uses to execute several commands on the cluster, was not passing the subject to the command.
@glassfishrobot Commented sherryshen said: Verified the fix on ogs-4.0-b48.zip. Thanks, Tom and Tim, for the analysis and fix.
@glassfishrobot Commented Was assigned to tmueller
@glassfishrobot Commented This issue was imported from java.net JIRA GLASSFISH-18927
@glassfishrobot Commented Reported by sherryshen
@glassfishrobot Commented Marked as fixed on Thursday, July 26th 2012, 12:32:14 pm
ogs-4.0-b45.zip
1) sqe-domain is created with the password file adminpassword.txt:
AS_ADMIN_PASSWORD=adminadmin
AS_ADMIN_MASTERPASSWORD=changeit
AS_ADMIN_USERPASSWORD=secret
2) sqe-cluster is created with 3 local instances: clustered_instance_1, 2, 3.
3) The cluster failed to start with the error "Authentication failed for user: null".
The cluster started fine with password on b38. The cluster started fine with empty password on b45.
Environment
SESE10, JDK1.6.0_30
Affected Versions
[4.0_dev]