zdata-inc / ambari-extensions

zData Ambari Stack containing HAWQ, Chorus, and Greenplum
http://zdata-inc.github.io/ambari-extensions

Clicking "Restart All" while the cluster is already stopped fails and blows up #33

Closed bdelamotte closed 9 years ago

bdelamotte commented 9 years ago

Log where it breaks

stderr:   /var/lib/ambari-agent/data/errors-85.txt

2015-06-10 19:13:54,272 - Error while executing command 'restart':
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 123, in execute
    method(env)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 232, in restart
    self.stop(env)
  File "/var/lib/ambari-agent/cache/stacks/PHD/9.9.9.zData/services/GREENPLUM/package/scripts/master.py", line 51, in stop
    user=params.admin_user
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 148, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 149, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 115, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 241, in action_run
    raise ex
Fail: Execution of 'gpstop -a -M smart -v' returned 2. 20150610:19:13:54:030140 gpstop:master:gpadmin-[INFO]:-Starting gpstop with args: -a -M smart -v
20150610:19:13:54:030140 gpstop:master:gpadmin-[DEBUG]:-Setting level of parallelism to: 64
20150610:19:13:54:030140 gpstop:master:gpadmin-[INFO]:-Gathering information and validating the environment...
20150610:19:13:54:030140 gpstop:master:gpadmin-[DEBUG]:-Checking if GPHOME env variable is set.
20150610:19:13:54:030140 gpstop:master:gpadmin-[DEBUG]:-Checking if MASTER_DATA_DIRECTORY env variable is set.
20150610:19:13:54:030140 gpstop:master:gpadmin-[DEBUG]:-Checking if LOGNAME or USER env variable is set.
20150610:19:13:54:030140 gpstop:master:gpadmin-[DEBUG]:---Checking that current user can use GP binaries
20150610:19:13:54:030140 gpstop:master:gpadmin-[DEBUG]:-Obtaining master's port from master data directory
20150610:19:13:54:030140 gpstop:master:gpadmin-[DEBUG]:-Read from postgresql.conf port=6543
20150610:19:13:54:030140 gpstop:master:gpadmin-[ERROR]:-gpstop error: postmaster.pid file does not exist.  is Greenplum instance already stopped?

stdout:   /var/lib/ambari-agent/data/output-85.txt

2015-06-10 19:13:53,820 - Could not verify stack version by calling '/usr/bin/distro-select versions > /tmp/tmp_ik32J'. Return Code: 1, Output: .
2015-06-10 19:13:53,824 - Execute['mkdir -p /var/lib/ambari-agent/data/tmp/AMBARI-artifacts/;     curl -kf -x "" --retry 10     http://master.ambaricluster.local:8080/resources//UnlimitedJCEPolicyJDK7.zip -o /var/lib/ambari-agent/data/tmp/AMBARI-artifacts//UnlimitedJCEPolicyJDK7.zip'] {'environment': ..., 'not_if': 'test -e /var/lib/ambari-agent/data/tmp/AMBARI-artifacts//UnlimitedJCEPolicyJDK7.zip', 'ignore_failures': True, 'path': ['/bin', '/usr/bin/']}
2015-06-10 19:13:53,838 - Skipping Execute['mkdir -p /var/lib/ambari-agent/data/tmp/AMBARI-artifacts/;     curl -kf -x "" --retry 10     http://master.ambaricluster.local:8080/resources//UnlimitedJCEPolicyJDK7.zip -o /var/lib/ambari-agent/data/tmp/AMBARI-artifacts//UnlimitedJCEPolicyJDK7.zip'] due to not_if
2015-06-10 19:13:53,839 - Group['hadoop'] {'ignore_failures': False}
2015-06-10 19:13:53,840 - Modifying group hadoop
2015-06-10 19:13:53,854 - Group['nobody'] {'ignore_failures': False}
2015-06-10 19:13:53,854 - Modifying group nobody
2015-06-10 19:13:53,864 - Group['nagios'] {'ignore_failures': False}
2015-06-10 19:13:53,864 - Modifying group nagios
2015-06-10 19:13:53,877 - User['nobody'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'nobody']}
2015-06-10 19:13:53,877 - Modifying user nobody
2015-06-10 19:13:53,888 - User['nagios'] {'gid': 'nagios', 'ignore_failures': False, 'groups': [u'hadoop']}
2015-06-10 19:13:53,888 - Modifying user nagios
2015-06-10 19:13:53,898 - User['ambari-qa'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': ['users']}
2015-06-10 19:13:53,898 - Modifying user ambari-qa
2015-06-10 19:13:53,911 - File['/var/lib/ambari-agent/data/tmp/changeUid.sh'] {'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555}
2015-06-10 19:13:53,912 - Execute['/var/lib/ambari-agent/data/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa 2>/dev/null'] {'not_if': 'test $(id -u ambari-qa) -gt 1000'}
2015-06-10 19:13:53,920 - Skipping Execute['/var/lib/ambari-agent/data/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa 2>/dev/null'] due to not_if
2015-06-10 19:13:53,932 - Execute['/bin/echo 0 > /selinux/enforce'] {'only_if': 'test -f /selinux/enforce'}
2015-06-10 19:13:54,130 - Execute['gpstop -a -M smart -v'] {'user': 'gpadmin'}
2015-06-10 19:13:54,272 - Error while executing command 'restart':
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 123, in execute
    method(env)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 232, in restart
    self.stop(env)
  File "/var/lib/ambari-agent/cache/stacks/PHD/9.9.9.zData/services/GREENPLUM/package/scripts/master.py", line 51, in stop
    user=params.admin_user
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 148, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 149, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 115, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 241, in action_run
    raise ex
Fail: Execution of 'gpstop -a -M smart -v' returned 2. 20150610:19:13:54:030140 gpstop:master:gpadmin-[INFO]:-Starting gpstop with args: -a -M smart -v
20150610:19:13:54:030140 gpstop:master:gpadmin-[DEBUG]:-Setting level of parallelism to: 64
20150610:19:13:54:030140 gpstop:master:gpadmin-[INFO]:-Gathering information and validating the environment...
20150610:19:13:54:030140 gpstop:master:gpadmin-[DEBUG]:-Checking if GPHOME env variable is set.
20150610:19:13:54:030140 gpstop:master:gpadmin-[DEBUG]:-Checking if MASTER_DATA_DIRECTORY env variable is set.
20150610:19:13:54:030140 gpstop:master:gpadmin-[DEBUG]:-Checking if LOGNAME or USER env variable is set.
20150610:19:13:54:030140 gpstop:master:gpadmin-[DEBUG]:---Checking that current user can use GP binaries
20150610:19:13:54:030140 gpstop:master:gpadmin-[DEBUG]:-Obtaining master's port from master data directory
20150610:19:13:54:030140 gpstop:master:gpadmin-[DEBUG]:-Read from postgresql.conf port=6543
20150610:19:13:54:030140 gpstop:master:gpadmin-[ERROR]:-gpstop error: postmaster.pid file does not exist.  is Greenplum instance already stopped?

jess-sol commented 9 years ago

This is not an issue with Ambari 2.0.0 and later; it seems to automatically skip stopping the service if it is already stopped.

@bdelamotte Should be fixed with commit eb04a83a. To be tested.
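
For anyone hitting this on an older Ambari, the general idea is to make the stop step idempotent: check for `postmaster.pid` before calling `gpstop`, since the log above shows `gpstop` returning 2 when that file is missing. Below is a minimal standalone sketch of that guard. It is not necessarily what commit eb04a83a does; `stop_greenplum`, `master_data_dir`, and `run` are hypothetical names used only for illustration, and the sketch does not switch to the `gpadmin` user or set up the Greenplum environment the way the real service script does.

```python
import os
import subprocess


def stop_greenplum(master_data_dir, run=subprocess.call):
    """Stop Greenplum only if it appears to be running.

    master_data_dir: path assumed to hold postmaster.pid while the
                     master instance is up (hypothetical parameter).
    run:             callable that executes a command list and returns
                     its exit code; injectable so the guard can be tested.

    Returns True if gpstop was invoked, False if the stop was skipped.
    """
    pid_file = os.path.join(master_data_dir, "postmaster.pid")

    # No pid file: treat the instance as already stopped and skip gpstop,
    # which would otherwise exit non-zero and fail the restart.
    if not os.path.isfile(pid_file):
        return False

    # Same arguments as the command shown in the log above.
    rc = run(["gpstop", "-a", "-M", "smart", "-v"])
    if rc != 0:
        raise RuntimeError("gpstop returned %d" % rc)
    return True
```

With a guard like this, a restart on a stopped cluster skips the stop step and goes straight to start instead of raising `Fail`.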