steven-matison / dfhz_hue_mpack

Install Hue with Ambari using this management pack.

successfully installed but HUE server stops after asking it to restart #4

Closed bayeslearner closed 4 years ago

bayeslearner commented 4 years ago

2020-08-05 01:58:10,940 - Stack Feature Version Info: Cluster Stack=3.1, Command Stack=None, Command Version=3.1.4.0-315 -> 3.1.4.0-315
2020-08-05 01:58:10,983 - Using hadoop conf dir: /usr/hdp/3.1.4.0-315/hadoop/conf
2020-08-05 01:58:11,396 - Stack Feature Version Info: Cluster Stack=3.1, Command Stack=None, Command Version=3.1.4.0-315 -> 3.1.4.0-315
2020-08-05 01:58:11,407 - Using hadoop conf dir: /usr/hdp/3.1.4.0-315/hadoop/conf
2020-08-05 01:58:11,410 - Skipping creation of User and Group as host is sys prepped or ignore_groupsusers_create flag is on
2020-08-05 01:58:11,411 - Directory['/tmp/hbase-hbase'] {'owner': 'hbase', 'create_parents': True, 'mode': 0775, 'cd_access': 'a'}
2020-08-05 01:58:11,413 - File['/var/lib/ambari-agent/tmp/changeUid.sh'] {'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555}
2020-08-05 01:58:11,416 - File['/var/lib/ambari-agent/tmp/changeUid.sh'] {'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555}
2020-08-05 01:58:11,417 - call['/var/lib/ambari-agent/tmp/changeUid.sh hbase'] {}
2020-08-05 01:58:11,435 - call returned (0, '1020')
2020-08-05 01:58:11,437 - Execute['/var/lib/ambari-agent/tmp/changeUid.sh hbase /home/hbase,/tmp/hbase,/usr/bin/hbase,/var/log/hbase,/tmp/hbase-hbase 1020'] {'not_if': '(test $(id -u hbase) -gt 1000) || (true)'}
2020-08-05 01:58:11,447 - Skipping Execute['/var/lib/ambari-agent/tmp/changeUid.sh hbase /home/hbase,/tmp/hbase,/usr/bin/hbase,/var/log/hbase,/tmp/hbase-hbase 1020'] due to not_if
2020-08-05 01:58:11,448 - Skipping setting dfs cluster admin and tez view acls as host is sys prepped
2020-08-05 01:58:11,448 - FS Type: HDFS
2020-08-05 01:58:11,448 - Directory['/etc/hadoop'] {'mode': 0755}
2020-08-05 01:58:11,481 - File['/usr/hdp/3.1.4.0-315/hadoop/conf/hadoop-env.sh'] {'content': InlineTemplate(...), 'owner': 'hdfs', 'group': 'hadoop'}
2020-08-05 01:58:11,483 - Directory['/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir'] {'owner': 'hdfs', 'group': 'hadoop', 'mode': 01777}
2020-08-05 01:58:11,512 - Execute[('setenforce', '0')] {'not_if': '(! which getenforce ) || (which getenforce && getenforce | grep -q Disabled)', 'sudo': True, 'only_if': 'test -f /selinux/enforce'}
2020-08-05 01:58:11,530 - Skipping Execute[('setenforce', '0')] due to only_if
2020-08-05 01:58:11,531 - Directory['/var/log/hadoop'] {'owner': 'root', 'create_parents': True, 'group': 'hadoop', 'mode': 0775, 'cd_access': 'a'}
2020-08-05 01:58:11,536 - Directory['/var/run/hadoop'] {'owner': 'root', 'create_parents': True, 'group': 'root', 'cd_access': 'a'}
2020-08-05 01:58:11,537 - Directory['/var/run/hadoop/hdfs'] {'owner': 'hdfs', 'cd_access': 'a'}
2020-08-05 01:58:11,538 - Directory['/tmp/hadoop-hdfs'] {'owner': 'hdfs', 'create_parents': True, 'cd_access': 'a'}
2020-08-05 01:58:11,546 - File['/usr/hdp/3.1.4.0-315/hadoop/conf/commons-logging.properties'] {'content': Template('commons-logging.properties.j2'), 'owner': 'hdfs'}
2020-08-05 01:58:11,549 - File['/usr/hdp/3.1.4.0-315/hadoop/conf/health_check'] {'content': Template('health_check.j2'), 'owner': 'hdfs'}
2020-08-05 01:58:11,559 - File['/usr/hdp/3.1.4.0-315/hadoop/conf/log4j.properties'] {'content': InlineTemplate(...), 'owner': 'hdfs', 'group': 'hadoop', 'mode': 0644}
2020-08-05 01:58:11,578 - File['/usr/hdp/3.1.4.0-315/hadoop/conf/hadoop-metrics2.properties'] {'content': InlineTemplate(...), 'owner': 'hdfs', 'group': 'hadoop'}
2020-08-05 01:58:11,579 - File['/usr/hdp/3.1.4.0-315/hadoop/conf/task-log4j.properties'] {'content': StaticFile('task-log4j.properties'), 'mode': 0755}
2020-08-05 01:58:11,581 - File['/usr/hdp/3.1.4.0-315/hadoop/conf/configuration.xsl'] {'owner': 'hdfs', 'group': 'hadoop'}
2020-08-05 01:58:11,588 - File['/etc/hadoop/conf/topology_mappings.data'] {'owner': 'hdfs', 'content': Template('topology_mappings.data.j2'), 'only_if': 'test -d /etc/hadoop/conf', 'group': 'hadoop', 'mode': 0644}
2020-08-05 01:58:11,595 - File['/etc/hadoop/conf/topology_script.py'] {'content': StaticFile('topology_script.py'), 'only_if': 'test -d /etc/hadoop/conf', 'mode': 0755}
2020-08-05 01:58:11,603 - Skipping unlimited key JCE policy check and setup since it is not required
2020-08-05 01:58:11,619 - Skipping stack-select on HUE because it does not exist in the stack-select package structure.
2020-08-05 01:58:12,033 - Using hadoop conf dir: /usr/hdp/3.1.4.0-315/hadoop/conf
2020-08-05 01:58:12,043 - Execute['ps -ef | grep hue | grep -v grep | awk '{print $2}' | xargs kill -9'] {'ignore_failures': True, 'user': 'hue'}
2020-08-05 01:58:12,228 - Skipping failure of Execute['ps -ef | grep hue | grep -v grep | awk '{print $2}' | xargs kill -9'] due to ignore_failures. Failure reason: Execution of 'ps -ef | grep hue | grep -v grep | awk '{print $2}' | xargs kill -9' returned 137.
kill: sending signal to 27914 failed: Operation not permitted
kill: sending signal to 27940 failed: No such process
kill: sending signal to 27943 failed: No such process
-bash: line 1: 27940 Done ps -ef
27941 | grep hue
27942 | grep -v grep
27943 | awk '{print $2}'
27944 Killed | xargs kill -9
2020-08-05 01:58:12,229 - File['/var/run/hue/hue-server.pid'] {'owner': 'hue', 'action': ['delete']}
2020-08-05 01:58:12,230 - Deleting File['/var/run/hue/hue-server.pid']
2020-08-05 01:58:12,235 - Configure Hue Service
2020-08-05 01:58:12,236 - Directory['/var/log/hue'] {'owner': 'hue', 'create_parents': True, 'group': 'hue', 'mode': 0755, 'cd_access': 'a'}
2020-08-05 01:58:12,239 - Directory['/var/run/hue'] {'owner': 'hue', 'group': 'hue', 'create_parents': True, 'mode': 0755, 'cd_access': 'a'}
2020-08-05 01:58:12,240 - File['/var/log/hue/hue-install.log'] {'content': '', 'owner': 'hue', 'group': 'hue', 'mode': 0644}
2020-08-05 01:58:12,241 - Writing File['/var/log/hue/hue-install.log'] because contents don't match
2020-08-05 01:58:12,242 - File['/var/run/hue/hue-server.pid'] {'content': '', 'owner': 'hue', 'group': 'hue', 'mode': 0644}
2020-08-05 01:58:12,242 - Writing File['/var/run/hue/hue-server.pid'] because it doesn't exist
2020-08-05 01:58:12,243 - Changing owner for /var/run/hue/hue-server.pid from 0 to hue
2020-08-05 01:58:12,243 - Changing group for /var/run/hue/hue-server.pid from 0 to hue
2020-08-05 01:58:12,243 - Execute['find /var/lib/ambari-agent/cache/common-services/HUE/4.6.0/package -iname ".sh" | xargs chmod +x'] {}
2020-08-05 01:58:12,258 - HdfsResource['/user/hue'] {'security_enabled': False, 'hadoop_bin_dir': '/usr/hdp/3.1.4.0-315/hadoop/bin', 'keytab': [EMPTY], 'dfs_type': 'HDFS', 'default_fs': 'true', 'user': 'hdfs', 'hdfs_resource_ignore_file': '/var/lib/ambari-agent/data/.hdfs_resource_ignore', 'hdfs_site': ..., 'kinit_path_local': 'kinit', 'principal_name': [EMPTY], 'recursive_chmod': True, 'owner': 'hue', 'hadoop_conf_dir': '/usr/hdp/3.1.4.0-315/hadoop/conf', 'type': 'directory', 'action': ['create_on_execute'], 'immutable_paths': [u'/mr-history/done', u'/warehouse/tablespace/managed/hive', u'/warehouse/tablespace/external/hive', u'/app-logs', u'/tmp'], 'mode': 0755}
2020-08-05 01:58:12,268 - call['ambari-sudo.sh su hdfs -l -s /bin/bash -c 'curl -sS -L -w '"'"'%{http_code}'"'"' -X GET -d '"'"''"'"' -H '"'"'Content-Length: 0'"'"' '"'"'http://c7401.ambari.apache.org:50070/webhdfs/v1/user/hue?op=GETFILESTATUS&user.name=hdfs'"'"' 1>/tmp/tmpMo0h9z 2>/tmp/tmpYlFv81''] {'logoutput': None, 'quiet': False}
2020-08-05 01:58:12,408 - call returned (0, '')
2020-08-05 01:58:12,410 - get_user_call_output returned (0, u'{"FileStatus":{"accessTime":0,"blockSize":0,"childrenNum":0,"fileId":148012,"group":"hdfs","length":0,"modificationTime":1596581549934,"owner":"hue","pathSuffix":"","permission":"755","replication":0,"storagePolicy":0,"type":"DIRECTORY"}}200', u'')
2020-08-05 01:58:12,412 - Creating /usr/local/hue/desktop/conf/log.conf file
2020-08-05 01:58:12,420 - File['/usr/local/hue/desktop/conf/log.conf'] {'owner': 'hue', 'content': InlineTemplate(...)}
2020-08-05 01:58:12,421 - Creating /usr/local/hue/desktop/conf/hue.ini config file
2020-08-05 01:58:12,472 - File['/usr/local/hue/desktop/conf/hue.ini'] {'owner': 'hue', 'content': InlineTemplate(...)}
2020-08-05 01:58:12,474 - Run the script file to add configurations
2020-08-05 01:58:12,474 - Execute['/var/lib/ambari-agent/cache/common-services/HUE/4.6.0/package/files/configs.sh set c7401.ambari.apache.org hdp hdfs-site 'dfs.namenode.acls.enabled' 'true''] {}
2020-08-05 01:58:13,984 - Execute['/var/lib/ambari-agent/cache/common-services/HUE/4.6.0/package/files/configs.sh set c7401.ambari.apache.org hdp core-site 'hadoop.proxyuser.hue.groups' '''] {}
2020-08-05 01:58:15,168 - Execute['/var/lib/ambari-agent/cache/common-services/HUE/4.6.0/package/files/configs.sh set c7401.ambari.apache.org hdp core-site 'hadoop.proxyuser.hue.hosts' '*''] {}
2020-08-05 01:58:16,319 - Execute['/usr/local/hue/build/env/bin/supervisor >> /var/log/hue/hue-install.log 2>&1 &'] {'environment': {'SPARK_HOME': '/usr/hdp/current/spark-client', 'HADOOP_CONF_DIR': u'/usr/hdp/3.1.4.0-315/hadoop/conf', 'JAVA_HOME': u'/usr/jdk64/jdk1.8.0_112', 'PATH': '$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin'}, 'user': 'hue'}
2020-08-05 01:58:16,444 - Execute['ps -ef | grep hue | grep supervisor | grep -v grep | awk '{print $2}' > /var/run/hue/hue-server.pid'] {'user': 'hue'}
2020-08-05 01:58:16,601 - Pid files for current script are not defined
2020-08-05 01:58:16,649 - Skipping stack-select on HUE because it does not exist in the stack-select package structure.

Command completed successfully!

steven-matison commented 4 years ago

This looks like an issue with the hue user and system permissions...

2020-08-05 01:58:12,043 - Execute['ps -ef | grep hue | grep -v grep | awk '{print $2}' | xargs kill -9'] {'ignore_failures': True, 'user': 'hue'}
2020-08-05 01:58:12,228 - Skipping failure of Execute['ps -ef | grep hue | grep -v grep | awk '{print $2}' | xargs kill -9'] due to ignore_failures. Failure reason: Execution of 'ps -ef | grep hue | grep -v grep | awk '{print $2}' | xargs kill -9' returned 137. 
**_kill: sending signal to 27914 failed: Operation not permitted_**
kill: sending signal to 27940 failed: No such process
kill: sending signal to 27943 failed: No such process
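A hedged aside on that exit 137: the mpack's stop command (`ps -ef | grep hue | ... | xargs kill -9`) matches its own pipeline (the shell command line itself contains "hue"), so `kill -9` takes out `xargs`, and it also tries to signal processes owned by other users, hence "Operation not permitted". A sketch of a safer variant using `pgrep` (which never matches its own process and can filter by owner); the `supervisor` pattern is an assumption based on how this mpack starts Hue:

```shell
# Hypothetical safer stop: pgrep never matches itself, and -u limits matches
# to processes owned by the hue user, so root-owned PIDs ("Operation not
# permitted") and the pipeline's own shell (exit 137 above) are untouched.
pids=$(pgrep -u hue -f supervisor || true)
if [ -n "$pids" ]; then
    kill -9 $pids
else
    echo "no hue supervisor processes found"
fi
```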

I would recommend confirming that you can fully stop Hue on the node where it is installed. Next, confirm the Hue installation directories are fully chowned hue:hue. Make sure hue is added to the sudo group. Then try to start again from Ambari.

USER COMMANDS

groupadd hue
useradd -g hue hue
usermod -a -G wheel hue
chown -R hue:hue /home/hue
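To double-check the ownership fix, a small sketch (the `owned_by` helper is hypothetical, not part of the mpack, and assumes GNU `stat` on Linux); on a real node you would point it at /usr/local/hue and /home/hue:

```shell
# Hypothetical helper: succeeds only when a path is owned by the expected
# user:group, e.g. owned_by /usr/local/hue hue:hue
owned_by() {
    path=$1; want=$2
    [ "$(stat -c '%U:%G' "$path")" = "$want" ]
}

# Demo against a directory we own ourselves:
mkdir -p /tmp/hue-demo
owned_by /tmp/hue-demo "$(id -un):$(id -gn)" && echo "ownership ok"
```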

You may find more thorough support on Stack Overflow, the Cloudera community, or https://discourse.gethue.com

bayeslearner commented 4 years ago

How come this was not taken care of by the installation? Thanks.

bayeslearner commented 4 years ago

Didn't work with those changes made.

steven-matison commented 4 years ago

There are some known bugs with user:group management in HDP 3. More importantly, depending on your system, what works out of the box in my demos isn't exactly the same in every install (including yours). I have not seen this exact error while installing Hue with this mpack, but I have seen other services do similar things. Whether installing Hue from Ambari or manually, I have seen it need some manual touches to get things started the first time. Even after a successful install, I have had to manually verify and correct that everything is tidy with the hue user and the Hue app folders.

In your error you can see that Ambari is not able to kill the process ID for Hue (if it's running). If it's not running, you need to start it (with the correct user/permissions) and retry the Ambari restart command. My suggestions in the previous post were meant to address that, so Ambari can kill the hue process during a restart command from the ambari-agent.

To recap, when you have issues with the service: fully shut down Hue manually (kill the process via the command line as root), get the hue user set up for sudo, get the Hue program files owned by the hue user, then sudo to the hue user and start Hue as that user (the command is shown in the Ambari restart-failure modal). After that, retry the Ambari restart command that failed and it should work. Finally, you should be able to stop and restart the service from Ambari without conflict.
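The recovery steps above can be sketched as a script. This is a hedged sketch only: the paths come from the install log earlier in this thread, it must run as root on the Hue node, and the guard makes it a no-op on machines without a hue user:

```shell
# Sketch of the manual recovery sequence (assumes mpack default paths from
# the install log; run as root on the Hue node).
if id hue >/dev/null 2>&1; then
    pkill -9 -u hue -f supervisor || true                      # 1. fully stop Hue
    chown -R hue:hue /usr/local/hue /var/log/hue /var/run/hue  # 2. fix ownership
    # 3. start Hue as the hue user (same command Ambari runs)
    su - hue -c '/usr/local/hue/build/env/bin/supervisor >> /var/log/hue/hue-install.log 2>&1 &'
    status="hue restart attempted"
else
    status="no hue user on this machine; skipping"
fi
echo "$status"
```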

bayeslearner commented 4 years ago

Thanks for your patient answer. I was able to get Hue started and into the web interface. However, I still have issues.

I checked the errors in hue-install.log, and it seems port 8888 is already bound, so I did the following:

Although Hue and its web interface are running, most connectors, including the file browser, are not working, so it may again be a user-permission issue.

For example, I see this: [05/Aug/2020 19:14:19 +0000] exceptions_renderable ERROR Potential detail: HTTPConnectionPool(host='localhost', port=50070): Max retries exceeded with url: /webhdfs/v1/user/admin?op=GETFILESTATUS&user.name=hue&doas=admin (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fc46419e150>: Failed to establish a new connection: [Errno 111] Connection refused',))
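(The Errno 111 above is Hue's WebHDFS client being refused on localhost:50070, which usually means hue.ini still points at localhost instead of the NameNode host. A sketch of the relevant hue.ini section, assuming the NameNode from the install log, c7401.ambari.apache.org, and the default non-HA HDP ports, is:)

```ini
# hue.ini (managed via Ambari in this mpack): point the WebHDFS client at
# the real NameNode, not localhost. Host/ports here are assumptions taken
# from the install log; adjust for your cluster.
[hadoop]
  [[hdfs_clusters]]
    [[[default]]]
      fs_defaultfs=hdfs://c7401.ambari.apache.org:8020
      webhdfs_url=http://c7401.ambari.apache.org:50070/webhdfs/v1
```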

The Hue local admin account "admin" exists at the OS level, as an Ambari user, and as an HDFS user.
The user "hue" also exists as an OS account with password-less sudo. What do I need to do to let it impersonate "admin"?
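(Impersonation is granted on the Hadoop side through the proxyuser keys in core-site.xml, the same keys the mpack's configs.sh touches in the install log above. A sketch of a permissive setting follows; the wildcards are an assumption and should be tightened for real clusters, and the NameNode must be restarted afterward:)

```xml
<!-- core-site.xml: allow the hue service user to impersonate end users
     such as "admin" from any host / any group (restrict as needed). -->
<property>
  <name>hadoop.proxyuser.hue.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hue.groups</name>
  <value>*</value>
</property>
```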

steven-matison commented 4 years ago

Now we are getting somewhere; it appears the original issue was the port conflict. You can change Hue's port to avoid that.
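For the port conflict, a quick hypothetical check before editing `http_port` in hue.ini (the `port_in_use` helper is illustrative and relies on `ss` from iproute2):

```shell
# Report whether anything is already listening on a TCP port.
port_in_use() {
    # ss -ltn: listening TCP sockets, numeric output; column 4 is local-addr:port
    ss -ltn 2>/dev/null | awk '{print $4}' | grep -q ":$1\$"
}

if port_in_use 8888; then
    echo "port 8888 already bound: change Hue's http_port (or stop the other service)"
else
    echo "port 8888 is free"
fi
```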

After install and start are solved, configuring the plugins is additional administrative work. The original HUE service was built for Ambari 2; the modifications I made were the bare minimum to get Hue to install and start in my demos. I did ultimately deploy a working Hue to an HDP 3 production cluster. The Hue 4.6.0 configuration for that cluster, with working YARN, Hive, HDFS, etc., is here as an example of a working config:

https://github.com/steven-matison/HDP3-Hue-Service/blob/Hue.4.6.0/configuration/live.hue.ini

As you can see, the Hue configuration file is extensive. You will need to make the required modifications from Ambari and restart Hue to test their impact. My best recommendation is to start by comparing against the live.hue.ini above and work on each plugin one at a time. Refer to the official gethue.com documentation, the Hue forum, and/or the Cloudera community and Stack Overflow for additional support on individual conflicts.