cloudera / cloudera-playbook

Cloudera deployment automation with Ansible
Apache License 2.0

Cluster Template Cannot Be Imported #61

Closed Aidan-OS closed 4 years ago

Aidan-OS commented 4 years ago

I have been slowly working through getting this repo to run for me, and I think I have finally hit a problem I don't know how to solve myself. I am setting up an SCM server, a DB server, an edge server, 2 name nodes, and 4 data nodes (this differs from the README in that there is no 3rd name node and no KRB5). This is the error I am getting:

TASK [cdh : Wait for import cluster template command to complete] ******************************************************************************************************************************************
FAILED - RETRYING: Wait for import cluster template command to complete (10 retries left).
FAILED - RETRYING: Wait for import cluster template command to complete (9 retries left).
FAILED - RETRYING: Wait for import cluster template command to complete (8 retries left).
FAILED - RETRYING: Wait for import cluster template command to complete (7 retries left).
FAILED - RETRYING: Wait for import cluster template command to complete (6 retries left).
FAILED - RETRYING: Wait for import cluster template command to complete (5 retries left).
FAILED - RETRYING: Wait for import cluster template command to complete (4 retries left).
fatal: [scm-server.eastus.companyname.com]: FAILED! => {"attempts": 8, "changed": false, "connection": "close", "content": "{\n \"id\" : 34,\n \"name\" : \"ClusterTemplateImport\",\n \"startTime\" : \"2020-01-17T00:20:36.163Z\",\n \"endTime\" : \"2020-01-17T00:27:04.770Z\",\n \"active\" : false,\n \"success\" : false,\n \"resultMessage\" : \"Failed to import cluster template.\",\n \"children\" : {\n \"items\" : [ {\n \"id\" : 46,\n \"name\" : \"First Run\",\n \"startTime\" : \"2020-01-17T00:27:03.941Z\",\n \"endTime\" : \"2020-01-17T00:27:04.765Z\",\n \"active\" : false,\n \"success\" : false,\n \"resultMessage\" : \"Failed to perform First Run of services.\"\n }, {\n \"id\" : 36,\n \"name\" : \"DeployParcels\",\n \"startTime\" : \"2020-01-17T00:20:36.446Z\",\n \"endTime\" : \"2020-01-17T00:27:00.061Z\",\n \"active\" : false,\n \"success\" : true,\n \"resultMessage\" : \"The Following parcels successfully activated : CDH-6.3.2-1.cdh6.3.2.p0.1605554.\",\n \"clusterRef\" : {\n \"clusterName\" : \"cluster_1\",\n \"displayName\" : \"cluster_1\"\n }\n } ]\n },\n \"canRetry\" : true\n}", "content_type": "application/json;charset=utf-8", "cookies": {"CLOUDERA_MANAGER_SESSIONID": "node0lu3fbcycum4o1wjvg7w5o2rjl16974.node0"}, "cookies_string": "CLOUDERA_MANAGER_SESSIONID=node0lu3fbcycum4o1wjvg7w5o2rjl16974.node0", "date": "Fri, 17 Jan 2020 00:27:43 GMT", "elapsed": 0, "expires": "Thu, 01 Jan 1970 00:00:00 GMT", "failed_when_result": true, "json": {"active": false, "canRetry": true, "children": {"items": [{"active": false, "endTime": "2020-01-17T00:27:04.765Z", "id": 46, "name": "First Run", "resultMessage": "Failed to perform First Run of services.", "startTime": "2020-01-17T00:27:03.941Z", "success": false}, {"active": false, "clusterRef": {"clusterName": "cluster_1", "displayName": "cluster_1"}, "endTime": "2020-01-17T00:27:00.061Z", "id": 36, "name": "DeployParcels", "resultMessage": "The Following parcels successfully activated : CDH-6.3.2-1.cdh6.3.2.p0.1605554.", "startTime": "2020-01-17T00:20:36.446Z", "success": true}]}, "endTime": "2020-01-17T00:27:04.770Z", "id": 34, "name": "ClusterTemplateImport", "resultMessage": "Failed to import cluster template.", "startTime": "2020-01-17T00:20:36.163Z", "success": false}, "msg": "OK (unknown bytes)", "redirected": false, "set_cookie": "CLOUDERA_MANAGER_SESSIONID=node0lu3fbcycum4o1wjvg7w5o2rjl16974.node0;Path=/;HttpOnly", "status": 200, "url": "http://scm-server.eastus.companyname.com:7180/api/v33/commands/34", "x_content_type_options": "nosniff", "x_frame_options": "DENY", "x_xss_protection": "1; mode=block"}
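For anyone reading similar output: the escaped "content" string is ordinary JSON, and the failing step can be found by walking children.items. Below is a minimal Python sketch, assuming the command result has the shape shown in the output above. The helper name failed_steps is hypothetical, and CONTENT is a trimmed copy of the response that keeps only name, success, resultMessage, and children.

```python
import json

# Trimmed copy of the "content" field from the failed task output.
# Only the fields needed for this illustration are kept.
CONTENT = """
{
  "name": "ClusterTemplateImport",
  "success": false,
  "resultMessage": "Failed to import cluster template.",
  "children": {
    "items": [
      {"name": "First Run", "success": false,
       "resultMessage": "Failed to perform First Run of services."},
      {"name": "DeployParcels", "success": true,
       "resultMessage": "The Following parcels successfully activated : CDH-6.3.2-1.cdh6.3.2.p0.1605554."}
    ]
  }
}
"""


def failed_steps(command):
    """Return (name, resultMessage) for the command and every nested child
    whose "success" flag is false. Hypothetical helper, not part of the playbook."""
    failures = []
    if not command.get("success", False):
        failures.append((command.get("name"), command.get("resultMessage")))
    # Child commands, when present, live under children.items.
    for child in command.get("children", {}).get("items", []):
        failures.extend(failed_steps(child))
    return failures


if __name__ == "__main__":
    for name, message in failed_steps(json.loads(CONTENT)):
        print(f"{name}: {message}")
```

With this response, the sketch reports the top-level import and the First Run child, while DeployParcels is skipped because it succeeded. That narrows the problem to the First Run step.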

Yes, that is what it gives, \n's and all. Reading through it, I cannot figure out what is going wrong. The Cloudera server is running; however, I do see some errors when looking through the log file (these occur hundreds of lines apart and are condensed here for reading):

2020-01-17 00:27:02,724 ERROR scm-web-475:com.cloudera.cmf.service.AbstractRoleHandler: Unable to generate configuration for GATEWAY base group

2020-01-17 00:27:02,725 WARN scm-web-475:com.cloudera.server.cmf.descriptor.components.DescriptorFactory: Could not generate client configs for service: YARN (MR2 Included) java.lang.RuntimeException: com.cloudera.cmf.service.config.ConfigGenException: Could not find JOBHISTORY dependent role

2020-01-17 00:27:02,848 WARN scm-web-475:com.cloudera.server.cmf.descriptor.components.DescriptorFactory: Could not generate client configs for service: Hive java.lang.RuntimeException: java.lang.RuntimeException: com.cloudera.cmf.service.config.ConfigGenException: Could not find JOBHISTORY dependent role

2020-01-17 00:31:43,798 INFO scm-web-494:com.cloudera.api.ApiExceptionMapper: Exception caught in API invocation. Msg:Role does not have a process. java.util.NoSuchElementException: Role does not have a process.

2020-01-17 00:31:43,885 WARN scm-web-494:com.cloudera.server.cmf.components.OperationsManagerImpl: Exception while building client config: java.lang.RuntimeException: com.cloudera.cmf.service.config.ConfigGenException: Could not find JOBHISTORY dependent role

2020-01-17 00:31:43,888 WARN scm-web-494:com.cloudera.api.ApiExceptionMapper: Unexpected exception. Msg:java.lang.IllegalStateException: Failed to create client configuration for service yarn java.lang.RuntimeException: java.lang.IllegalStateException: Failed to create client configuration for service yarn

2020-01-17 00:31:43,973 WARN scm-web-476:com.cloudera.server.cmf.components.OperationsManagerImpl: Exception while building client config: java.lang.RuntimeException: java.lang.RuntimeException: com.cloudera.cmf.service.config.ConfigGenException: Could not find JOBHISTORY dependent role

The list goes on for about another 30 errors, all mentioning JOBHISTORY. What am I doing wrong to cause this? Do I need to be running the third name node?
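One way to check that every such error points at the same missing role is to tally the ConfigGenException messages. Here is a rough Python sketch, assuming the message format seen in the log excerpts above. The helper count_missing_roles is hypothetical, and SAMPLE reuses three of those log lines.

```python
import re
from collections import Counter

# Matches the message format seen in the log excerpts, e.g.
# "ConfigGenException: Could not find JOBHISTORY dependent role".
MISSING_ROLE = re.compile(r"ConfigGenException: Could not find (\w+) dependent role")


def count_missing_roles(lines):
    """Count how often each role name appears in a 'Could not find ... dependent role'
    message. Hypothetical helper for illustration only."""
    counts = Counter()
    for line in lines:
        for role in MISSING_ROLE.findall(line):
            counts[role] += 1
    return counts


# Three lines copied from the excerpts in this post.
SAMPLE = [
    "2020-01-17 00:27:02,725 WARN scm-web-475:com.cloudera.server.cmf.descriptor.components.DescriptorFactory: "
    "Could not generate client configs for service: YARN (MR2 Included) java.lang.RuntimeException: "
    "com.cloudera.cmf.service.config.ConfigGenException: Could not find JOBHISTORY dependent role",
    "2020-01-17 00:31:43,798 INFO scm-web-494:com.cloudera.api.ApiExceptionMapper: "
    "Exception caught in API invocation. Msg:Role does not have a process.",
    "2020-01-17 00:31:43,885 WARN scm-web-494:com.cloudera.server.cmf.components.OperationsManagerImpl: "
    "Exception while building client config: java.lang.RuntimeException: "
    "com.cloudera.cmf.service.config.ConfigGenException: Could not find JOBHISTORY dependent role",
]

if __name__ == "__main__":
    for role, hits in count_missing_roles(SAMPLE).most_common():
        print(f"{role}: {hits}")
```

To run it against a real log, pass an open file object instead of SAMPLE, since the function accepts any iterable of lines. If only one role name shows up, as it does here, the cluster layout is probably missing a host that carries that role.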

Aidan-OS commented 4 years ago

3 name nodes are apparently mandatory.