microsoft / azure-tools-for-java

Azure tools for Java, including Azure Toolkits for Eclipse, IntelliJ and related projects.

[IntelliJ][HDP3.0][Regression] Submit success, but in log show UNDEFINED status #2085

Closed jingyanjingyan closed 5 years ago

jingyanjingyan commented 6 years ago

Build: PR2087

Repro Steps:

  1. Submit sample.LogQuery to an HDP3.0 cluster "spark231HDI40-reg"

Result: The submission succeeds, but the log shows:

Package and deploy the job to Spark cluster
INFO: Begin uploading file C:\Users\v-yajing\IdeaProjects\maventest1\out\artifacts\maventest1_DefaultArtifact\default_artifact.jar to Azure Blob Storage Account wasbs://spark231hdi40-reg-2018-09-25t00-57-59-798z@jillstormtest.blob.core.windows.net/SparkSubmission/2018/09/25/696d80cb-0210-4f69-95f2-735f1b2445e8/default_artifact.jar ...
INFO: Submit file to azure blob 'wasbs://spark231hdi40-reg-2018-09-25t00-57-59-798z@jillstormtest.blob.core.windows.net/SparkSubmission/2018/09/25/696d80cb-0210-4f69-95f2-735f1b2445e8/default_artifact.jar' successfully.
LOG: Warning: Master yarn-cluster is deprecated since 2.0. Please use master "yarn" with specified deploy mode instead.
LOG: 18/09/25 06:30:08 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
LOG: Warning: Skip remote jar wasbs://spark231hdi40-reg-2018-09-25t00-57-59-798z@jillstormtest.blob.core.windows.net/SparkSubmission/2018/09/25/696d80cb-0210-4f69-95f2-735f1b2445e8/default_artifact.jar.
LOG: 18/09/25 06:30:09 INFO AzureIaasSink: Init starting. Initializing MdsLogger.
LOG: 18/09/25 06:30:09 INFO AzureIaasSink: Init completed.
LOG: 18/09/25 06:30:09 INFO WasbAzureIaasSink: Init completed.
LOG: 18/09/25 06:30:09 INFO MetricsSinkAdapter: Sink azurefs2 started
LOG: 18/09/25 06:30:09 INFO MetricsSystemImpl: Scheduled Metric snapshot period at 60 second(s).
LOG: 18/09/25 06:30:09 INFO MetricsSystemImpl: azure-file-system metrics system started
LOG: 18/09/25 06:30:09 INFO RequestHedgingRMFailoverProxyProvider: Created wrapped proxy for [rm1, rm2]
LOG: 18/09/25 06:30:09 INFO RequestHedgingRMFailoverProxyProvider: Looking for the active RM in [rm1, rm2]...
LOG: 18/09/25 06:30:10 INFO RequestHedgingRMFailoverProxyProvider: Found active RM [rm2]
LOG: 18/09/25 06:30:10 INFO Client: Requesting a new application from cluster with 2 NodeManagers
LOG: 18/09/25 06:30:10 INFO Configuration: found resource resource-types.xml at file:/etc/hadoop/3.0.2.0-50/0/resource-types.xml
LOG: 18/09/25 06:30:10 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (51200 MB per container)
LOG: 18/09/25 06:30:10 INFO Client: Will allocate AM container, with 4480 MB memory including 384 MB overhead
LOG: 18/09/25 06:30:10 INFO Client: Setting up container launch context for our AM
LOG: 18/09/25 06:30:10 INFO Client: Setting up the launch environment for our AM container
LOG: 18/09/25 06:30:10 INFO Client: Preparing resources for our AM container
LOG: 18/09/25 06:30:14 INFO Client: Uploading resource file:/tmp/spark-77faa89c-b866-4ccd-9e79-bc237e2a9f82/spark_conf3471624071982603036.zip -> wasb://spark231hdi40-reg-2018-09-25t00-57-59-798z@jillstormtest.blob.core.windows.net/user/livy/.sparkStaging/application_1537837745842_0021/spark_conf.zip
LOG: 18/09/25 06:30:15 INFO SecurityManager: Changing view acls groups to:
LOG: 18/09/25 06:30:15 INFO SecurityManager: Changing modify acls groups to:
LOG: 18/09/25 06:30:15 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(livy); groups with view permissions: Set(); users with modify permissions: Set(livy); groups with modify permissions: Set()
LOG: 18/09/25 06:30:15 INFO Client: Submitting application application_1537837745842_0021 to ResourceManager
LOG: 18/09/25 06:30:15 INFO YarnClientImpl: Submitted application application_1537837745842_0021
LOG: 18/09/25 06:30:15 INFO Client: Application report for application_1537837745842_0021 (state: ACCEPTED)
LOG: 18/09/25 06:30:15 INFO Client:
LOG: client token: N/A
LOG: diagnostics: AM container is launched, waiting for AM container to Register with RM
LOG: ApplicationMaster host: N/A
LOG: ApplicationMaster RPC port: -1
LOG: queue: default
LOG: start time: 1537857015192
LOG: final status: UNDEFINED
LOG: tracking URL: http://hn1-spark2.5usshxhselhuhopisc4j3mqykc.cx.internal.cloudapp.net:8088/proxy/application_1537837745842_0021/
LOG: user: livy
LOG: 18/09/25 06:30:15 INFO ShutdownHookManager: Shutdown hook called
LOG: 18/09/25 06:30:15 INFO ShutdownHookManager: Deleting directory /tmp/spark-77faa89c-b866-4ccd-9e79-bc237e2a9f82
LOG: 18/09/25 06:30:15 INFO ShutdownHookManager: Deleting directory /tmp/spark-c85ae159-1ecd-4e07-a944-77f1642910e7
LOG: 18/09/25 06:30:15 INFO MetricsSystemImpl: Stopping azure-file-system metrics system...
LOG: 18/09/25 06:30:15 INFO MetricsSinkAdapter: azurefs2 thread interrupted.
LOG: 18/09/25 06:30:15 INFO MetricsSystemImpl: azure-file-system metrics system stopped.
LOG: 18/09/25 06:30:15 INFO MetricsSystemImpl: azure-file-system metrics system shutdown complete.

wezhang commented 6 years ago

I suppose these log lines:

LOG: 18/09/25 06:30:15 INFO Client:
LOG: client token: N/A
LOG: diagnostics: AM container is launched, waiting for AM container to Register with RM
LOG: ApplicationMaster host: N/A
LOG: ApplicationMaster RPC port: -1
LOG: queue: default
LOG: start time: 1537857015192
LOG: final status: UNDEFINED
LOG: tracking URL: http://hn1-spark2.5usshxhselhuhopisc4j3mqykc.cx.internal.cloudapp.net:8088/proxy/application_1537837745842_0021/
LOG: user: livy

are YARN status output. @lcadzy, could you help confirm that?
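For anyone triaging similar logs, here is a minimal sketch of how the application report could be pulled out of the `LOG:`-prefixed text and checked. It assumes the fields keep the `LOG: <key>: <value>` shape quoted in this issue. The function names (`parse_report`, `parse_state`, `is_still_pending`) are hypothetical and not part of the toolkit. It also relies on the YARN convention that `final status` stays UNDEFINED until the application finishes, which would be consistent with the `state: ACCEPTED` line in the original log, though that reading is not confirmed in this thread.

```python
import re

# Field names exactly as they appear in the application report quoted in this issue.
REPORT_KEYS = (
    "client token", "diagnostics", "ApplicationMaster host",
    "ApplicationMaster RPC port", "queue", "start time",
    "final status", "tracking URL", "user",
)

# Terminal values of YARN's application state.
TERMINAL_STATES = {"FINISHED", "FAILED", "KILLED"}


def parse_report(log_text):
    """Collect report fields from text where every entry is prefixed with 'LOG:'."""
    fields = {}
    # Splitting on the marker works whether entries are on one line or many.
    for chunk in re.split(r"\s*LOG:\s*", log_text):
        key, sep, value = chunk.partition(": ")
        if sep and key in REPORT_KEYS:
            fields[key] = value.strip()
    return fields


def parse_state(log_text):
    """Return the state from an 'Application report ... (state: X)' entry, or None."""
    match = re.search(r"\(state: (\w+)\)", log_text)
    return match.group(1) if match else None


def is_still_pending(state, final_status):
    """True when UNDEFINED simply means the application has not finished yet."""
    return final_status == "UNDEFINED" and state not in TERMINAL_STATES


# Excerpt copied from the log in this issue.
sample = (
    "LOG: 18/09/25 06:30:15 INFO Client: Application report for "
    "application_1537837745842_0021 (state: ACCEPTED) "
    "LOG: 18/09/25 06:30:15 INFO Client: LOG: client token: N/A "
    "LOG: diagnostics: AM container is launched, waiting for AM container to "
    "Register with RM LOG: ApplicationMaster host: N/A "
    "LOG: ApplicationMaster RPC port: -1 LOG: queue: default "
    "LOG: start time: 1537857015192 LOG: final status: UNDEFINED "
    "LOG: tracking URL: http://hn1-spark2.5usshxhselhuhopisc4j3mqykc.cx.internal"
    ".cloudapp.net:8088/proxy/application_1537837745842_0021/ LOG: user: livy"
)

report = parse_report(sample)
state = parse_state(sample)
print(state, report["final status"], is_still_pending(state, report["final status"]))
```

If this reading is right, the UNDEFINED value here reflects a report taken right after submission rather than a failed job. Whether the plugin should keep showing it is a separate question for the maintainers.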

v-jiche commented 5 years ago

Still repros. Build: azure-toolkit-for-intellij-2018.2.develop.850.10-31-2018

jingyanjingyan commented 5 years ago

Verified as fixed with private build 2461.