eclipse-ee4j / glassfish

Eclipse GlassFish
https://eclipse-ee4j.github.io/glassfish/

failed to start a cluster with password #18927

Closed glassfishrobot closed 12 years ago

glassfishrobot commented 12 years ago

ogs-4.0-b45.zip

1) sqe-domain is created with the password file adminpassword.txt:
AS_ADMIN_PASSWORD=adminadmin
AS_ADMIN_MASTERPASSWORD=changeit
AS_ADMIN_USERPASSWORD=secret
2) sqe-cluster is created with 3 local instances: clustered_instance_1, 2, 3.
3) The cluster failed to start with the error "Authentication failed for user: null".
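For reference, a password file with the three entries from step 1 could be written as in the sketch below. The scratch directory and the owner-only permissions are assumptions made for illustration; the report does not say where the file lives or how it is protected.

```shell
# Sketch: write the password file described in step 1.
# workdir is a hypothetical scratch location, not taken from the report.
workdir=$(mktemp -d)
pwfile="$workdir/adminpassword.txt"

# Create the file readable by the owner only (an assumption, not from the report).
# The subshell keeps the umask change from affecting later commands.
(
  umask 077
  cat > "$pwfile" <<'EOF'
AS_ADMIN_PASSWORD=adminadmin
AS_ADMIN_MASTERPASSWORD=changeit
AS_ADMIN_USERPASSWORD=secret
EOF
)

# Count the AS_ADMIN_ entries as a quick sanity check.
entries=$(grep -c '^AS_ADMIN_' "$pwfile")
echo "wrote $entries entries to $pwfile"
```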

The cluster started fine with password on b38. The cluster started fine with empty password on b45.

Environment

SESE10, JDK1.6.0_30

Affected Versions

[4.0_dev]

glassfishrobot commented 12 years ago

@glassfishrobot Commented sherryshen said:

[1] start cluster with password on b45.
http://bigapp-oblade-10.us.oracle.com:1080/job/sherry-core-lc/17/artifact/cli.log
http://bigapp-oblade-10.us.oracle.com:1080/job/sherry-core-lc/17/console

start-cluster-common:
[exec] asadmin --host localhost --port 4848 --user admin --passwordfile /root/.hudson/jobs/sherry-core-lc/workspace/appserver-sqe/build-config/adminpasswordfile.txt --interactive=false --echo=true --terse=false start-cluster --verbose=false sqe-cluster
[exec] Command start-cluster failed.
[exec] remote failure: clustered_instance_1: Could not start instance clustered_instance_1 on node localhost-sqe-domain (localhost).
[exec]
[exec] Command failed on node localhost-sqe-domain (localhost): Command start-local-instance failed.
[exec]
[exec] CLI802 Synchronization failed for directory config, caused by:
[exec] Authentication failed for user: null
[exec] (Usually, this means invalid user name and/or password)

[2] start cluster with password on b38.
http://bigapp-oblade-10.us.oracle.com:1080/job/sherry-core-lc/15/artifact/cli.log
http://bigapp-oblade-10.us.oracle.com:1080/job/sherry-core-lc/15/console

glassfishrobot commented 12 years ago

@glassfishrobot Commented @tjquinno said: I'll take ownership of this, at least for the moment.

glassfishrobot commented 12 years ago

@glassfishrobot Commented @tjquinno said: Reassigning back to Tom.

Sherry, please read through this (long) entry because I have a question for you at the bottom that might help us understand this.

I cannot reproduce this specific problem, but it might be related to something else I found.

When I run these commands things work fine for me:

I created adminpassword.txt as Sherry described. I put it in my current default directory which is outside of the installed GlassFish directory tree.

Note that the next batch of commands all specify --passwordfile adminpassword.txt on each command.

asadmin --passwordfile adminpassword.txt create-domain sqe-domain
Enter admin user name [Enter to accept default "admin" / no password]> admin

asadmin start-domain sqe-domain

asadmin uptime # correctly prompts for credentials

asadmin --passwordfile adminpassword.txt uptime # works correctly with no prompting

asadmin --passwordfile adminpassword.txt create-cluster c1

asadmin --passwordfile adminpassword.txt create-local-instance --cluster c1 i1

asadmin --passwordfile adminpassword.txt start-instance i1

asadmin --passwordfile adminpassword.txt list-instances

asadmin --passwordfile adminpassword.txt stop-instance i1

asadmin --passwordfile adminpassword.txt start-cluster c1

asadmin --passwordfile adminpassword.txt list-instances i1

asadmin --passwordfile adminpassword.txt list-clusters

asadmin --passwordfile adminpassword.txt stop-cluster c1

asadmin --passwordfile adminpassword.txt delete-domain sqe-domain


Next, I ran

export AS_ADMIN_PASSWORDFILE=adminpassword.txt # Note, no directory spec

and then ran these commands:

asadmin create-domain sqe-domain
Enter admin user name [Enter to accept default "admin" / no password]> admin
Using default port 4848 for Admin.
Using default port 8080 for HTTP Instance.
Using default port 7676 for JMS.
Using default port 3700 for IIOP.
Using default port 8181 for HTTP_SSL.
Using default port 3820 for IIOP_SSL.
Using default port 3920 for IIOP_MUTUALAUTH.
Using default port 8686 for JMX_ADMIN.
Using default port 6666 for OSGI_SHELL.
Using default port 9009 for JAVA_DEBUGGER.
Distinguished Name of the self-signed X.509 Server Certificate is:
[CN=dhcp-whq-twvpn-1-vpnpool-10-159-222-141.vpn.oracle.com,OU=GlassFish,O=Oracle Corporation,L=Santa Clara,ST=California,C=US]
Distinguished Name of the self-signed X.509 Server Certificate is:
[CN=dhcp-whq-twvpn-1-vpnpool-10-159-222-141.vpn.oracle.com-instance,OU=GlassFish,O=Oracle Corporation,L=Santa Clara,ST=California,C=US]
asadmin start-domain sqe-domain
Domain.xml customization failed : com.sun.enterprise.module.bootstrap.BootException: Cannot find main module DomainCreation : no such module
Domain sqe-domain created.
Domain sqe-domain admin port is 4848.
Domain sqe-domain admin user is "admin".
Command create-domain executed successfully.

asadmin start-domain sqe-domain
Waiting for sqe-domain to start .asadmin up.time ......
Successfully started the domain : sqe-domain
domain Location: /Users/tjquinn/asgroup/v3/J/publish/glassfish3/glassfish/domains/sqe-domain
Log File: /Users/tjquinn/asgroup/v3/J/publish/glassfish3/glassfish/domains/sqe-domain/logs/server.log
Admin Port: 4848
Command start-domain executed successfully.

asadmin uptime
Up 8 secs
Command uptime executed successfully.

asadmin create-cluster c1
Command create-cluster executed successfully.

asadmin create-local-instance --cluster c1 i1
Rendezvoused with DAS on localhost:4848.
Port Assignments for server instance i1:
JMX_SYSTEM_CONNECTOR_PORT=28686
JMS_PROVIDER_PORT=27676
HTTP_LISTENER_PORT=28080
ASADMIN_LISTENER_PORT=24848
JAVA_DEBUGGER_PORT=29009
IIOP_SSL_LISTENER_PORT=23820
IIOP_LISTENER_PORT=23700
OSGI_SHELL_TELNET_PORT=26666
HTTP_SSL_LISTENER_PORT=28181
IIOP_SSL_MUTUALAUTH_PORT=23920
Command create-local-instance executed successfully.

asadmin start-instance i1
remote failure: Could not start instance i1 on node localhost-sqe-domain (localhost).

Command failed on node localhost-sqe-domain (localhost): Command start-local-instance failed.

java.io.FileNotFoundException: /Users/tjquinn/asgroup/v3/J/publish/glassfish3/glassfish/domains/sqe-domain/config/adminpassword.txt (No such file or directory)

To complete this operation run the following command locally on host localhost from the GlassFish install location /Users/tjquinn/asgroup/v3/J/publish/glassfish3:

lib/nadmin start-local-instance --node localhost-sqe-domain --sync normal i1
Command start-instance failed.


With the environment variable set to a relative path, start-instance fails because, in my case, adminpassword.txt does not exist in the working directory of the spawned shell that tries to start the instance.

Sherry, I wonder if maybe you have AS_ADMIN_PASSWORDFILE set in a way that refers to a valid file from the directory where you run the commands but points to an existing but invalid file when run from a spawned shell when the DAS is trying to start the instance?
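The effect of a relative path can be reproduced without GlassFish. The sketch below is only an illustration: the two scratch directories are hypothetical stand-ins for the directory where asadmin is typed and for the working directory of the spawned process, and `check` is a helper invented for this example.

```shell
# Sketch: a relative path resolves against whichever directory is current.
# "caller" stands in for where asadmin is typed, "spawned" for the working
# directory of the process the DAS launches. Both are hypothetical.
caller=$(mktemp -d)
spawned=$(mktemp -d)
echo 'AS_ADMIN_PASSWORD=adminadmin' > "$caller/adminpassword.txt"

# check DIR PATH: report whether PATH names a file when resolved from DIR.
check() {
  # Run in a subshell so the directory change does not leak out.
  ( cd "$1" && [ -f "$2" ] && echo found || echo missing )
}

relative_from_caller=$(check "$caller" adminpassword.txt)
relative_from_spawned=$(check "$spawned" adminpassword.txt)
absolute_from_spawned=$(check "$spawned" "$caller/adminpassword.txt")

echo "relative path, caller directory:  $relative_from_caller"
echo "relative path, spawned directory: $relative_from_spawned"
echo "absolute path, spawned directory: $absolute_from_spawned"
```

Only the absolute form names the same file from both directories, which is consistent with the FileNotFoundException above pointing into the domain's config directory.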

glassfishrobot commented 12 years ago

@glassfishrobot Commented sherryshen said: Thanks, Tim, for the analysis.

1) AS_ADMIN_PASSWORDFILE is not used in the sqe tests for b45 and b38. Using the same password file located in the test workspace, the start-cluster failure seen on b45 did not show on b38.

On b38:
[exec] asadmin --host localhost --port 4848 --user admin --passwordfile /root/.hudson/jobs/sherry-core-lc/workspace/appserver-sqe/build-config/adminpasswordfile.txt --interactive=false --echo=true --terse=false start-cluster --verbose=false sqe-cluster
[exec] Command start-cluster executed successfully.

2) If I set AS_ADMIN_PASSWORDFILE in the environment, e.g.
$ export AS_ADMIN_PASSWORDFILE=/root/.hudson/jobs/sherry-core-lc/workspace/appserver-sqe/build-config/adminpasswordfile.txt

the start-cluster failure on b45 is resolved.

If I don't set AS_ADMIN_PASSWORDFILE in the environment, or set it to empty with
$ export AS_ADMIN_PASSWORDFILE=

the start-cluster failure is shown on b45.
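As a sketch of the workaround in 2), the variable can be set to an absolute path even when the file is named relatively. The `abs_path` helper and the `demo` scratch directory are hypothetical names introduced for this example; they are not part of the sqe test setup.

```shell
# Sketch: turn a possibly relative password file path into an absolute one
# before exporting it, so spawned processes refer to the same file.
abs_path() {
  # Resolve the directory part in a subshell; the caller's directory is unchanged.
  ( cd "$(dirname "$1")" && printf '%s/%s\n' "$(pwd -P)" "$(basename "$1")" )
}

# demo is a hypothetical scratch directory standing in for the test workspace.
demo=$(mktemp -d)
echo 'AS_ADMIN_PASSWORD=adminadmin' > "$demo/adminpasswordfile.txt"

# Name the file relatively from inside demo, but export the absolute form.
AS_ADMIN_PASSWORDFILE=$(cd "$demo" && abs_path ./adminpasswordfile.txt)
export AS_ADMIN_PASSWORDFILE
echo "AS_ADMIN_PASSWORDFILE=$AS_ADMIN_PASSWORDFILE"
```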

glassfishrobot commented 12 years ago

@glassfishrobot Commented sherryshen said: After discussing with Tom, I did cli tests on glassfish-4.0-b46.zip.

$ cat ./adminpasswordfile.txt
AS_ADMIN_PASSWORD=adminadmin
AS_ADMIN_MASTERPASSWORD=changeit
AS_ADMIN_USERPASSWORD=secret
$ asadmin --user admin --passwordfile ./adminpasswordfile.txt create-domain sqe-domain
$ asadmin --user admin --passwordfile ./adminpasswordfile.txt start-domain sqe-domain
$ asadmin --user admin --passwordfile ./adminpasswordfile.txt version
$ asadmin --user admin --passwordfile ./adminpasswordfile.txt create-local-instance --cluster sqe-cluster clustered_instance_1
$ asadmin --user admin --passwordfile ./adminpasswordfile.txt start-cluster sqe-cluster

A. With export AS_ADMIN_PASSWORDFILE= or without AS_ADMIN_PASSWORDFILE set at all, the failure is observed on start-cluster.

remote failure: clustered_instance_1: Could not start instance clustered_instance_1 on node localhost-sqe-domain (localhost).
Command failed on node localhost-sqe-domain (localhost): Command start-local-instance failed.
CLI802 Synchronization failed for directory config, caused by:
Authentication failed for user: null
(Usually, this means invalid user name and/or password)

To complete this operation run the following command locally on host localhost from the GlassFish install location /root/.hudson/jobs/sherry-core-lc/workspace/glassfish3:

lib/nadmin start-local-instance --node localhost-sqe-domain --sync normal clustered_instance_1

The command start-instance failed for: clustered_instance_1
Command start-cluster failed.

B. With export AS_ADMIN_PASSWORDFILE=$SPS_HOME/build-config/adminpasswordfile.txt, start-cluster works fine.

For both A and B, I noticed that sqe-domain is created with the error "Domain.xml customization failed".

$ asadmin --user admin --passwordfile ./adminpasswordfile.txt create-domain sqe-domain
Using default port 4848 for Admin
Using default port 8080 for HTTP Instance.
Using default port 7676 for JMS.
Using default port 3700 for IIOP.
Using default port 8181 for HTTP_SSL.
Using default port 3820 for IIOP_SSL.
Using default port 3920 for IIOP_MUTUALAUTH.
Using default port 8686 for JMX_ADMIN.
Using default port 6666 for OSGI_SHELL.
Using default port 9009 for JAVA_DEBUGGER.
Distinguished Name of the self-signed X.509 Server Certificate is:
[CN=bigapp-oblade-10,OU=GlassFish,O=Oracle Corporation,L=Santa Clara,ST=California,C=US]
Distinguished Name of the self-signed X.509 Server Certificate is:
[CN=bigapp-oblade-10-instance,OU=GlassFish,O=Oracle Corporation,L=Santa Clara,ST=California,C=US]
Domain.xml customization failed : com.sun.enterprise.module.bootstrap.BootException: Cannot find main module DomainCreation : no such module
Domain sqe-domain created.
Domain sqe-domain admin port is 4848.
Domain sqe-domain admin user is "admin".
Command create-domain executed successfully.

glassfishrobot commented 12 years ago

@glassfishrobot Commented tmueller said: I've confirmed the test case. Thanks for providing these details.

Here are some more observations.

This appears to be a problem only when starting the instance for the first time, and then only with start-cluster.

If the instance is started with start-instance (using the --passwordfile option) or start-local-instance, the instance starts. The instance can then be stopped, and start-cluster will work fine using the --passwordfile option. So this problem is related to how start-cluster is running start-local-instance the first time that an instance is started. Somehow setting the AS_ADMIN_PASSWORDFILE environment variable avoids the problem.
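The sequence described above can be written down as a dry run. This is a sketch, not a verified procedure: `ASADMIN` is set to `echo asadmin` so the script only prints the commands, and the password file path, cluster name and instance name are taken from the earlier comments as placeholders.

```shell
# Dry-run sketch of the sequence described above.
# ASADMIN only prints the commands; replace it with "asadmin" to run them
# against a real installation (untested here).
ASADMIN="echo asadmin"
PWFILE=./adminpasswordfile.txt

# first_start_sequence CLUSTER INSTANCE: start the instance once on its own,
# stop it, then start the whole cluster.
first_start_sequence() {
  $ASADMIN --user admin --passwordfile "$PWFILE" start-instance "$2"
  $ASADMIN --user admin --passwordfile "$PWFILE" stop-instance "$2"
  $ASADMIN --user admin --passwordfile "$PWFILE" start-cluster "$1"
}

plan=$(first_start_sequence sqe-cluster clustered_instance_1)
printf '%s\n' "$plan"
```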

Note: the Domain.xml customization failed problem is unrelated to this issue. It is probably related to the recent HK2 changes.

glassfishrobot commented 12 years ago

@glassfishrobot Commented tmueller said: Fixed on the trunk in revision 55228.

The root cause of the problem is that the ClusterCommandHelper, which start-cluster uses to execute several commands on the cluster, was not passing the subject to the command.

glassfishrobot commented 12 years ago

@glassfishrobot Commented sherryshen said: Verified the fix on ogs-4.0-b48.zip. Thanks, Tom and Tim, for the analysis and fix.

glassfishrobot commented 12 years ago

@glassfishrobot Commented Was assigned to tmueller

glassfishrobot commented 7 years ago

@glassfishrobot Commented This issue was imported from java.net JIRA GLASSFISH-18927

glassfishrobot commented 12 years ago

@glassfishrobot Commented Reported by sherryshen

glassfishrobot commented 12 years ago

@glassfishrobot Commented Marked as fixed on Thursday, July 26th 2012, 12:32:14 pm