alibaba / x-deeplearning

An industrial deep learning framework for high-dimension sparse data
Apache License 2.0
4.25k stars 1.03k forks

Error when submitting a distributed job #208

Open mengyiliu22 opened 5 years ago

mengyiliu22 commented 5 years ago

The container log shows the following:

2019-04-24 18:17:01,747 INFO xdl.ContainerBase: pull image [docker pull registry.cn-hangzhou.aliyuncs.com/xdl/xdl:ubuntu-cpu-mxnet1.3]
2019-04-24 18:17:02,533 INFO xdl.ContainerBase: cmd [docker pull registry.cn-hangzhou.aliyuncs.com/xdl/xdl:ubuntu-cpu-mxnet1.3] cost 786 ms status [0]
2019-04-24 18:17:02,534 INFO xdl.ContainerBase: pull image cost 787 ms
2019-04-24 18:17:04,165 INFO xdl.ContainerRunner$ContainerSignalHandler: Container is killed by signal:15.
2019-04-24 18:17:04,165 INFO xdl.ContainerBase: stop container cmd [docker stop -t 30 xdl_application_1556099741086_0003_scheduler_0_000002]

The log above contains no explicit error message; it only says the container was killed. What could be the cause?
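As a reading aid for the log above, here is a minimal Python sketch that splits a flattened log back into entries, decodes the signal number, and measures how long the container lived after the image pull finished. The function names (`split_entries`, `entry_time`, `describe_kill`) are made up for this illustration and are not part of XDL. Signal 15 is SIGTERM, a termination request that comes from outside the process rather than a crash inside it, so the log alone cannot say who sent it.

```python
import re
import signal
from datetime import datetime

# Start of a log4j-style entry, e.g. "2019-04-24 18:17:01,747".
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}")
# The message format is taken from the container log in this issue.
KILLED = re.compile(r"killed by signal:(\d+)")


def split_entries(flat_log):
    """Split a log that was pasted as one line into one string per entry."""
    starts = [m.start() for m in TIMESTAMP.finditer(flat_log)]
    ends = starts[1:] + [len(flat_log)]
    return [flat_log[s:e].strip() for s, e in zip(starts, ends)]


def entry_time(entry):
    """Parse the leading 23-character timestamp of an entry."""
    return datetime.strptime(entry[:23], "%Y-%m-%d %H:%M:%S,%f")


def describe_kill(flat_log):
    """Return (signal name, seconds since the previous entry), or None."""
    entries = split_entries(flat_log)
    for previous, entry in zip(entries, entries[1:]):
        match = KILLED.search(entry)
        if match:
            name = signal.Signals(int(match.group(1))).name
            gap = (entry_time(entry) - entry_time(previous)).total_seconds()
            return name, gap
    return None


# Two consecutive entries copied from the container log above.
SAMPLE = (
    "2019-04-24 18:17:02,534 INFO xdl.ContainerBase: pull image cost 787 ms "
    "2019-04-24 18:17:04,165 INFO xdl.ContainerRunner$ContainerSignalHandler: "
    "Container is killed by signal:15."
)
```

Calling `describe_kill(SAMPLE)` reports SIGTERM arriving about 1.6 seconds after the pull completed, which suggests the container was stopped from outside almost immediately after it started.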

mengyiliu22 commented 5 years ago

@songyue1104 @yiling-dc

yiling-dc commented 5 years ago

A YARN container can be killed for many reasons. Check the ApplicationMaster (AM) log for clues. For example, a container that was scheduled onto an unhealthy node may also be killed.
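The two checks suggested above can be sketched in Python as thin wrappers around the standard YARN CLI. This is only an illustration: the helper names are hypothetical, the application ID is the one embedded in the container name from the log in this issue, and `yarn logs` only returns output when log aggregation is enabled on the cluster.

```python
import subprocess


def am_log_command(application_id):
    """Build the CLI call that fetches aggregated logs (AM included) for an app."""
    return ["yarn", "logs", "-applicationId", application_id]


def unhealthy_nodes_command():
    """Build the CLI call that lists NodeManagers currently marked UNHEALTHY."""
    return ["yarn", "node", "-list", "-states", "UNHEALTHY"]


def run(command):
    """Run a command on a host with the Hadoop client installed; return stdout."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout
```

On a gateway machine, `run(am_log_command("application_1556099741086_0003"))` would return the application's logs, and `run(unhealthy_nodes_command())` would show whether the node that hosted the killed container is listed as unhealthy.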

mengyiliu22 commented 5 years ago

Here is the AM log. Could you help me figure out the cause?

2019-05-05 17:16:56,086 INFO conf.Configuration: found resource resource-types.xml at file:/opt/hadoop/hadoop-3.1.2/etc/hadoop/resource-types.xml
2019-05-05 17:16:56,142 INFO resource.ResourceUtils: Adding resource type - name = yarn.io/gpu, units = , type = COUNTABLE
2019-05-05 17:16:56,164 INFO Configuration.deprecation: yarn.resourcemanager.zk-address is deprecated. Instead, use hadoop.zk.address
2019-05-05 17:16:56,174 INFO xdl.AppMasterBase: Zookeeper connect string is:[master:2181,worker1:2181]
2019-05-05 17:16:56,249 INFO imps.CuratorFrameworkImpl: Starting
2019-05-05 17:16:56,267 INFO zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.13-2d71af4dbe22557fda74f9a9b4309b15a7487f03, built on 06/29/2018 00:39 GMT
2019-05-05 17:16:56,267 INFO zookeeper.ZooKeeper: Client environment:host.name=worker1
2019-05-05 17:16:56,267 INFO zookeeper.ZooKeeper: Client environment:java.version=1.8.0_45
2019-05-05 17:16:56,267 INFO zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation
2019-05-05 17:16:56,267 INFO zookeeper.ZooKeeper: Client environment:java.home=/home/work/.jumbo/opt/sun-java8/jre
2019-05-05 17:16:56,267 INFO zookeeper.ZooKeeper: Client
environment:java.class.path=/opt/hadoop/hadoop-3.1.2/etc/hadoop:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/hadoop-common-3.1.2-tests.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/hadoop-kms-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/hadoop-nfs-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/hadoop-common-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/commons-lang3-3.4.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/commons-cli-1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jetty-server-9.3.24.v20180605.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/commons-net-3.6.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/asm-5.0.4.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/zookeeper-3.4.13.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jackson-core-2.7.8.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jsr305-3.0.0.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/kerb-util-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/httpclient-4.5.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jackson-mapper-asl-1.9.13.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/hadoop-annotations-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/commons-math3-3.1.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/kerb-admin-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jsp-api-2.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jaxb-impl-2.2.3-1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jetty-util-9.3.24.v20180605.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/kerb-common-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jackson-annotations-2.7.8.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jetty-xml-9.3.24.v20180605.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jetty-io-9.3.24.v20180605.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jackson-databind-2.7.8.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/protobu
f-java-2.5.0.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/kerby-pkix-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/json-smart-2.3.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/audience-annotations-0.5.0.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jersey-json-1.19.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jsr311-api-1.1.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/avro-1.7.7.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/kerby-asn1-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/kerby-config-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/re2j-1.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/kerb-crypto-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jcip-annotations-1.0-1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/commons-codec-1.11.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/curator-framework-2.13.0.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/kerb-server-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/snappy-java-1.0.5.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/kerby-xdr-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/guava-11.0.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/commons-logging-1.1.3.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jackson-xc-1.9.13.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/accessors-smart-1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/curator-client-2.13.0.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jersey-servlet-1.19.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/slf4j-api-1.7.25.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/kerb-core-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/woodstox-core-5.0.3.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/commons-beanutils-1.9.3.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jetty-security-9.3.24.v20180605.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/comm
on/lib/httpcore-4.4.4.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/commons-io-2.5.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/htrace-core4-4.1.0-incubating.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jackson-core-asl-1.9.13.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/commons-lang-2.6.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/nimbus-jose-jwt-4.41.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jsch-0.1.54.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/curator-recipes-2.13.0.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/javax.servlet-api-3.1.0.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/log4j-1.2.17.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jul-to-slf4j-1.7.25.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/paranamer-2.3.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/hadoop-auth-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jettison-1.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jersey-server-1.19.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/gson-2.2.4.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jaxb-api-2.2.11.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/netty-3.10.5.Final.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/metrics-core-3.2.4.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/kerb-client-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/kerby-util-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jetty-webapp-9.3.24.v20180605.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/token-provider-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jetty-servlet-9.3.24.v20180605.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jetty-http-9.3.24.v20180605.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/commons-configuration2-2.1.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/kerb-identity-1.0.1.ja
r:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/commons-compress-1.18.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/stax2-api-3.1.4.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/kerb-simplekdc-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jersey-core-1.19.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/commons-collections-3.2.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jackson-jaxrs-1.9.13.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/hadoop-hdfs-rbf-3.1.2-tests.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/hadoop-hdfs-native-client-3.1.2-tests.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/hadoop-hdfs-nfs-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/hadoop-hdfs-3.1.2-tests.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/hadoop-hdfs-native-client-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/hadoop-hdfs-rbf-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/hadoop-hdfs-client-3.1.2-tests.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/hadoop-hdfs-client-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/hadoop-hdfs-httpfs-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/hadoop-hdfs-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/commons-lang3-3.4.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/commons-cli-1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jetty-server-9.3.24.v20180605.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/commons-net-3.6.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/asm-5.0.4.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/zookeeper-3.4.13.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/commons-daemon-1.0.13.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jetty-util-ajax-9.3.24.v20180605.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jackson-core-2.7.8.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jsr305-3.0.0.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/kerb-util-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/http
client-4.5.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jackson-mapper-asl-1.9.13.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/hadoop-annotations-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/json-simple-1.1.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/commons-math3-3.1.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/kerb-admin-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jaxb-impl-2.2.3-1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/leveldbjni-all-1.8.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jetty-util-9.3.24.v20180605.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/kerb-common-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jackson-annotations-2.7.8.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jetty-xml-9.3.24.v20180605.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jetty-io-9.3.24.v20180605.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jackson-databind-2.7.8.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/protobuf-java-2.5.0.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/kerby-pkix-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/json-smart-2.3.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/audience-annotations-0.5.0.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jersey-json-1.19.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jsr311-api-1.1.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/avro-1.7.7.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/kerby-asn1-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/kerby-config-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/re2j-1.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/kerb-crypto-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jcip-annotations-1.0-1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/commons-codec-1.11.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/curator-framework-2.13.0.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/kerb-server-1.
0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/snappy-java-1.0.5.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/kerby-xdr-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/guava-11.0.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/commons-logging-1.1.3.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jackson-xc-1.9.13.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/accessors-smart-1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/curator-client-2.13.0.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jersey-servlet-1.19.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/okio-1.6.0.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/okhttp-2.7.5.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/kerb-core-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/woodstox-core-5.0.3.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/commons-beanutils-1.9.3.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jetty-security-9.3.24.v20180605.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/httpcore-4.4.4.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/commons-io-2.5.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/htrace-core4-4.1.0-incubating.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jackson-core-asl-1.9.13.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/commons-lang-2.6.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/netty-all-4.0.52.Final.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/nimbus-jose-jwt-4.41.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jsch-0.1.54.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/curator-recipes-2.13.0.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/javax.servlet-api-3.1.0.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/log4j-1.2.17.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/paranamer-2.3.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/hadoop-auth-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jettison-1.1.jar:/opt/hadoop/hadoop-3.1.2/share/hado
op/hdfs/lib/jersey-server-1.19.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/gson-2.2.4.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jaxb-api-2.2.11.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/netty-3.10.5.Final.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/kerb-client-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/kerby-util-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jetty-webapp-9.3.24.v20180605.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/token-provider-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jetty-servlet-9.3.24.v20180605.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jetty-http-9.3.24.v20180605.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/commons-configuration2-2.1.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/kerb-identity-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/commons-compress-1.18.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/stax2-api-3.1.4.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/kerb-simplekdc-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jersey-core-1.19.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/commons-collections-3.2.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jackson-jaxrs-1.9.13.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-server-tests-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-api-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-registry-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-common-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-applications-unmanaged-am-launcher-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-server-timeline-pluginstorage-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-services-api-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-services-core-3.1.2
.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-server-router-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-server-common-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-server-web-proxy-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-client-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-server-applicationhistoryservice-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-server-nodemanager-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-server-sharedcachemanager-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-server-resourcemanager-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/lib/jackson-jaxrs-base-2.7.8.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/lib/java-util-1.9.0.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/lib/javax.inject-1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/lib/dnsjava-2.1.7.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/lib/geronimo-jcache_1.0_spec-1.0-alpha-1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/lib/guice-4.0.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/lib/HikariCP-java7-2.4.12.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/lib/mssql-jdbc-6.2.1.jre7.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/lib/fst-2.50.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/lib/json-io-2.5.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/lib/aopalliance-1.0.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/lib/snakeyaml-1.16.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/lib/objenesis-1.0.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/lib/guice-servlet-4.0.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/lib/ehcache-3.3.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/lib/metrics-core-3.2.4.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/lib/jackson-jaxrs-json-provider-2.7.8.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/lib/jackson-module-jaxb-annotations-2.7.8.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/lib/swagg
er-annotations-1.5.4.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/lib/jersey-guice-1.19.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/lib/jersey-client-1.19.jar:/var/lib/hadoop/tmp/nm-local-dir/usercache/root/appcache/application_1556600737538_0007/container_1556600737538_0007_01_000001/xdl-yarn-scheduler-1.0.0-SNAPSHOT-jar-with-dependencies.jar
2019-05-05 17:16:56,268 INFO zookeeper.ZooKeeper: Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2019-05-05 17:16:56,268 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
2019-05-05 17:16:56,268 INFO zookeeper.ZooKeeper: Client environment:java.compiler=
2019-05-05 17:16:56,268 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux
2019-05-05 17:16:56,268 INFO zookeeper.ZooKeeper: Client environment:os.arch=amd64
2019-05-05 17:16:56,268 INFO zookeeper.ZooKeeper: Client environment:os.version=3.10.0_3-0-0-20
2019-05-05 17:16:56,268 INFO zookeeper.ZooKeeper: Client environment:user.name=root
2019-05-05 17:16:56,268 INFO zookeeper.ZooKeeper: Client environment:user.home=/root
2019-05-05 17:16:56,268 INFO zookeeper.ZooKeeper: Client environment:user.dir=/var/lib/hadoop/tmp/nm-local-dir/usercache/root/appcache/application_1556600737538_0007/container_1556600737538_0007_01_000001
2019-05-05 17:16:56,269 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=master:2181,worker1:2181 sessionTimeout=60000 watcher=org.apache.curator.ConnectionState@6aeb35e6
2019-05-05 17:16:56,286 INFO zookeeper.ClientCnxn: Opening socket connection to server master/10.216.45.28:2181. Will not attempt to authenticate using SASL (unknown error)
2019-05-05 17:16:56,292 INFO zookeeper.ClientCnxn: Socket connection established to master/10.216.45.28:2181, initiating session
2019-05-05 17:16:56,298 INFO zookeeper.ClientCnxn: Session establishment complete on server master/10.216.45.28:2181, sessionid = 0x65045cd626be0070, negotiated timeout = 40000
2019-05-05 17:16:56,304 INFO state.ConnectionStateManager: State change: CONNECTED
2019-05-05 17:16:56,336 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2019-05-05 17:16:57,283 INFO xdl.AppMasterBase: ResourceManager client started.
2019-05-05 17:16:57,297 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
2019-05-05 17:16:57,399 INFO xdl.AppMasterBase: ApplicationMaster max memory [204800] max cpu cores [28] max gpu cores [0]
2019-05-05 17:16:57,399 INFO xdl.AppMasterBase: Register ApplicationMaster success.
2019-05-05 17:16:57,402 INFO xdl.AppMasterBase: NodeManager client started.
2019-05-05 17:16:57,403 INFO xdl.AppMasterBase: change work memory to 775
2019-05-05 17:16:57,403 INFO xdl.AppMasterBase: Worker container capability:[CPU: 4, GPU: 0, Memory: 775 MB
2019-05-05 17:16:57,403 INFO xdl.AppMasterBase: change ps memory to 775
2019-05-05 17:16:57,403 INFO xdl.AppMasterBase: PS container capability:[CPU: 4, GPU: 0, Memory: 775 MB
2019-05-05 17:16:57,403 INFO xdl.AppMasterBase: App:[application_1556600737538_0007] has [1] ps job and [1] worker job
2019-05-05 17:16:57,426 INFO xdl.AppMasterBase: hdfs://lmy-hdfs/user/root/.xdl/application_1556600737538_0007/config.tree_init.json
2019-05-05 17:16:57,452 INFO xdl.AppMasterBase: hdfs://lmy-hdfs/user/root/.xdl/application_1556600737538_0007/tdm_mock.tar.gz
2019-05-05 17:16:57,454 INFO xdl.AppMasterBase: hdfs://lmy-hdfs/user/root/.xdl/application_1556600737538_0007/xdl-yarn-scheduler-1.0.0-SNAPSHOT-jar-with-dependencies.jar
2019-05-05 17:16:57,456 INFO xdl.AppMasterBase: JAVA CLASS_PATH is /opt/hadoop/hadoop-3.1.2/etc/hadoop:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/hadoop-common-3.1.2-tests.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/hadoop-kms-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/hadoop-nfs-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/hadoop-common-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/commons-lang3-3.4.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/commons-cli-1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jetty-server-9.3.24.v20180605.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/commons-net-3.6.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/asm-5.0.4.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/zookeeper-3.4.13.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jackson-core-2.7.8.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jsr305-3.0.0.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/kerb-util-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/httpclient-4.5.2.jar:/opt/hadoop/ha
doop-3.1.2/share/hadoop/common/lib/jackson-mapper-asl-1.9.13.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/hadoop-annotations-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/commons-math3-3.1.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/kerb-admin-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jsp-api-2.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jaxb-impl-2.2.3-1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jetty-util-9.3.24.v20180605.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/kerb-common-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jackson-annotations-2.7.8.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jetty-xml-9.3.24.v20180605.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jetty-io-9.3.24.v20180605.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jackson-databind-2.7.8.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/protobuf-java-2.5.0.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/kerby-pkix-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/json-smart-2.3.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/audience-annotations-0.5.0.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jersey-json-1.19.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jsr311-api-1.1.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/avro-1.7.7.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/kerby-asn1-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/kerby-config-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/re2j-1.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/kerb-crypto-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jcip-annotations-1.0-1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/commons-codec-1.11.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/curator-framework-2.13.0.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/kerb-server-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/
lib/snappy-java-1.0.5.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/kerby-xdr-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/guava-11.0.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/commons-logging-1.1.3.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jackson-xc-1.9.13.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/accessors-smart-1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/curator-client-2.13.0.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jersey-servlet-1.19.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/slf4j-api-1.7.25.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/kerb-core-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/woodstox-core-5.0.3.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/commons-beanutils-1.9.3.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jetty-security-9.3.24.v20180605.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/httpcore-4.4.4.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/commons-io-2.5.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/htrace-core4-4.1.0-incubating.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jackson-core-asl-1.9.13.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/commons-lang-2.6.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/nimbus-jose-jwt-4.41.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jsch-0.1.54.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/curator-recipes-2.13.0.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/javax.servlet-api-3.1.0.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/log4j-1.2.17.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jul-to-slf4j-1.7.25.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/paranamer-2.3.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/hadoop-auth-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jettison-1.1.jar:/opt/hadoop/hadoop-3.
1.2/share/hadoop/common/lib/jersey-server-1.19.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/gson-2.2.4.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jaxb-api-2.2.11.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/netty-3.10.5.Final.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/metrics-core-3.2.4.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/kerb-client-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/kerby-util-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jetty-webapp-9.3.24.v20180605.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/token-provider-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jetty-servlet-9.3.24.v20180605.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jetty-http-9.3.24.v20180605.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/commons-configuration2-2.1.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/kerb-identity-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/commons-compress-1.18.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/stax2-api-3.1.4.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/kerb-simplekdc-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jersey-core-1.19.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/commons-collections-3.2.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/common/lib/jackson-jaxrs-1.9.13.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/hadoop-hdfs-rbf-3.1.2-tests.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/hadoop-hdfs-native-client-3.1.2-tests.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/hadoop-hdfs-nfs-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/hadoop-hdfs-3.1.2-tests.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/hadoop-hdfs-native-client-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/hadoop-hdfs-rbf-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/hadoop-hdfs-client-3.1.2-tests.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/hadoop-hdfs-client-3.1.2.jar:/opt/hadoop/had
oop-3.1.2/share/hadoop/hdfs/hadoop-hdfs-httpfs-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/hadoop-hdfs-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/commons-lang3-3.4.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/commons-cli-1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jetty-server-9.3.24.v20180605.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/commons-net-3.6.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/asm-5.0.4.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/zookeeper-3.4.13.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/commons-daemon-1.0.13.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jetty-util-ajax-9.3.24.v20180605.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jackson-core-2.7.8.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jsr305-3.0.0.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/kerb-util-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/httpclient-4.5.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jackson-mapper-asl-1.9.13.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/hadoop-annotations-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/json-simple-1.1.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/commons-math3-3.1.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/kerb-admin-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jaxb-impl-2.2.3-1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/leveldbjni-all-1.8.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jetty-util-9.3.24.v20180605.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/kerb-common-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jackson-annotations-2.7.8.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jetty-xml-9.3.24.v20180605.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jetty-io-9.3.24.v20180605.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jackson-databind-2.7.8.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/protobuf-java-2.5.0.jar:/opt/hadoop/ha
doop-3.1.2/share/hadoop/hdfs/lib/kerby-pkix-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/json-smart-2.3.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/audience-annotations-0.5.0.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jersey-json-1.19.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jsr311-api-1.1.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/avro-1.7.7.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/kerby-asn1-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/kerby-config-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/re2j-1.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/kerb-crypto-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jcip-annotations-1.0-1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/commons-codec-1.11.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/curator-framework-2.13.0.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/kerb-server-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/snappy-java-1.0.5.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/kerby-xdr-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/guava-11.0.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/commons-logging-1.1.3.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jackson-xc-1.9.13.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/accessors-smart-1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/curator-client-2.13.0.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jersey-servlet-1.19.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/okio-1.6.0.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/okhttp-2.7.5.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/kerb-core-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/woodstox-core-5.0.3.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/commons-beanutils-1.9.3.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jetty-security-9.3.24.v20180605.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/httpcore-4.4.4.jar:/op
t/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/commons-io-2.5.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/htrace-core4-4.1.0-incubating.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jackson-core-asl-1.9.13.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/commons-lang-2.6.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/netty-all-4.0.52.Final.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/nimbus-jose-jwt-4.41.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jsch-0.1.54.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/curator-recipes-2.13.0.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/javax.servlet-api-3.1.0.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/log4j-1.2.17.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/paranamer-2.3.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/hadoop-auth-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jettison-1.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jersey-server-1.19.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/gson-2.2.4.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jaxb-api-2.2.11.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/netty-3.10.5.Final.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/kerb-client-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/kerby-util-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jetty-webapp-9.3.24.v20180605.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/token-provider-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jetty-servlet-9.3.24.v20180605.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jetty-http-9.3.24.v20180605.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/commons-configuration2-2.1.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/kerb-identity-1.0.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/commons-compress-1.18.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/stax2-api-3.1.4.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/kerb-simplekdc-1.0.1.jar:/opt/had
oop/hadoop-3.1.2/share/hadoop/hdfs/lib/jersey-core-1.19.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/commons-collections-3.2.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/hdfs/lib/jackson-jaxrs-1.9.13.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-server-tests-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-api-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-registry-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-common-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-applications-unmanaged-am-launcher-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-server-timeline-pluginstorage-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-services-api-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-services-core-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-server-router-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-server-common-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-server-web-proxy-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-client-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-server-applicationhistoryservice-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-server-nodemanager-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-server-sharedcachemanager-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-server-resourcemanager-3.1.2.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/lib/jackson-jaxrs-base-2.7.8.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/lib/java-util-1.9.0.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/lib/javax.inject-1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/lib/dnsjava-2.1.7.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/lib/geronimo-jcache_1.0_spec-1.0-alpha-1.jar:/opt/hadoop/hado
op-3.1.2/share/hadoop/yarn/lib/guice-4.0.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/lib/HikariCP-java7-2.4.12.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/lib/mssql-jdbc-6.2.1.jre7.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/lib/fst-2.50.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/lib/json-io-2.5.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/lib/aopalliance-1.0.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/lib/snakeyaml-1.16.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/lib/objenesis-1.0.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/lib/guice-servlet-4.0.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/lib/ehcache-3.3.1.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/lib/metrics-core-3.2.4.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/lib/jackson-jaxrs-json-provider-2.7.8.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/lib/jackson-module-jaxb-annotations-2.7.8.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/lib/swagger-annotations-1.5.4.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/lib/jersey-guice-1.19.jar:/opt/hadoop/hadoop-3.1.2/share/hadoop/yarn/lib/jersey-client-1.19.jar:/var/lib/hadoop/tmp/nm-local-dir/usercache/root/appcache/application_1556600737538_0007/container_1556600737538_0007_01_000001/xdl-yarn-scheduler-1.0.0-SNAPSHOT-jar-with-dependencies.jar
2019-05-05 17:16:57,670 INFO xdl.AppMasterBase: request 1 scheduler containers
2019-05-05 17:16:57,679 INFO xdl.AppMasterBase: Making resource request for scheduler container 0
2019-05-05 17:16:57,736 INFO xdl.AppMasterBase: finish request 0 scheduler containers
2019-05-05 17:16:57,938 INFO xdl.AppMasterBase: finish request 0 scheduler containers
2019-05-05 17:16:58,139 INFO xdl.AppMasterBase: finish request 0 scheduler containers
2019-05-05 17:16:58,352 INFO xdl.AppMasterBase: finish request 1 scheduler containers
2019-05-05 17:16:58,552 INFO xdl.AppMasterBase: finish request all scheduler containers
2019-05-05 17:16:58,552 INFO xdl.AppMasterBase: request 1 ps containers
2019-05-05 17:16:58,552 INFO xdl.AppMasterBase: Making resource request for ps container 0
2019-05-05 17:16:58,554 INFO xdl.AppMasterBase: finish request 0 ps containers
2019-05-05 17:16:58,756 INFO xdl.AppMasterBase: finish request 0 ps containers
2019-05-05 17:16:58,957 INFO xdl.AppMasterBase: finish request 0 ps containers
2019-05-05 17:16:59,158 INFO xdl.AppMasterBase: finish request 0 ps containers
2019-05-05 17:16:59,362 INFO xdl.AppMasterBase: finish request 1 ps containers
2019-05-05 17:16:59,562 INFO xdl.AppMasterBase: finish request all ps containers
2019-05-05 17:16:59,562 INFO xdl.AppMasterBase: Making resource request for worker container 0
2019-05-05 17:16:59,564 INFO xdl.AppMasterBase: finish request 0 worker containers,total worker container is 1
2019-05-05 17:16:59,766 INFO xdl.AppMasterBase: finish request 0 worker containers,total worker container is 1
2019-05-05 17:16:59,967 INFO xdl.AppMasterBase: finish request 0 worker containers,total worker container is 1
2019-05-05 17:17:00,168 INFO xdl.AppMasterBase: finish request 0 worker containers,total worker container is 1
2019-05-05 17:17:00,371 INFO xdl.AppMasterBase: finish request 1 worker containers,total worker container is 1
2019-05-05 17:17:00,571 INFO xdl.AppMasterBase: finish request all worker containers
2019-05-05 17:17:00,586 INFO xdl.AppMasterBase: container start command is $JAVA_HOME/bin/java -Xmx256M com.alibaba.xdl.ContainerRunner -c=config.tree_init.json -j=scheduler -i=0 -z=master:2181,worker1:2181 -r=/xdl -u=root -v=tdm_mock.tar.gz -cpuset=_CPULIST -cd=GPU_LIST_PLACEHOLDER 1> /stdout 2> /stderr
2019-05-05 17:17:00,587 INFO xdl.AppMasterBase: Launching scheduler container [container_1556600737538_0007_01_000002]
2019-05-05 17:17:00,660 INFO xdl.AppMasterBase: container Id is container_1556600737538_0007_01_000002,node Id is worker1:43446
2019-05-05 17:17:00,660 INFO xdl.AppMasterBase: container start command is $JAVA_HOME/bin/java -Xmx256M com.alibaba.xdl.ContainerRunner -c=config.tree_init.json -j=ps -i=0 -z=master:2181,worker1:2181 -r=/xdl -u=root -v=tdm_mock.tar.gz -cpuset=_CPULIST -cd=GPU_LIST_PLACEHOLDER 1> /stdout 2> /stderr
2019-05-05 17:17:00,660 INFO xdl.AppMasterBase: Launching ps container [container_1556600737538_0007_01_000003]
2019-05-05 17:17:00,669 INFO xdl.AppMasterBase: container Id is container_1556600737538_0007_01_000003,node Id is worker1:43446
2019-05-05 17:17:00,670 INFO xdl.AppMasterBase: container start command is $JAVA_HOME/bin/java -Xmx256M com.alibaba.xdl.ContainerRunner -c=config.tree_init.json -j=worker -i=0 -z=master:2181,worker1:2181 -r=/xdl -u=root -v=tdm_mock.tar.gz -cpuset=_CPULIST -cd=GPU_LIST_PLACEHOLDER 1> /stdout 2> /stderr
2019-05-05 17:17:00,670 INFO xdl.AppMasterBase: Launching worker container [container_1556600737538_0007_01_000004]
2019-05-05 17:17:00,678 INFO xdl.AppMasterBase: container Id is container_1556600737538_0007_01_000004,node Id is worker1:43446
2019-05-05 17:25:38,067 INFO xdl.AppMasterBase: response container size 1
2019-05-05 17:25:38,067 INFO xdl.AppMasterBase: Completed container container_1556600737538_0007_01_000002 finish state is COMPLETE exit status 1
2019-05-05 17:25:38,068 INFO xdl.AppMasterBase: container_1556600737538_0007_01_000002 scheduler container lost, lose exit status is 1, Launch it again
2019-05-05 17:25:38,068 INFO xdl.AppMasterBase: node scheduler:0 fail times FailoverTimes [failoverTimes=1]
2019-05-05 17:25:38,071 INFO xdl.AppMasterBase: 0 container has reallocated
2019-05-05 17:25:40,075 INFO xdl.AppMasterBase: 1 container has reallocated
2019-05-05 17:25:40,075 INFO xdl.AppMasterBase: response container Container: [ContainerId: container_1556600737538_0007_01_000005, AllocationRequestId: 0, Version: 0, NodeId: worker1:43446, NodeHttpAddress: worker1:8042, Resource: <memory:22755, vCores:4>, Priority: 4, Token: Token { kind: ContainerToken, service: 10.216.45.30:43446 }, ExecutionType: GUARANTEED, ]
2019-05-05 17:25:40,076 INFO xdl.AppMasterBase: response container Container: [ContainerId: container_1556600737538_0007_01_000005, AllocationRequestId: 0, Version: 0, NodeId: worker1:43446, NodeHttpAddress: worker1:8042, Resource: <memory:22755, vCores:4>, Priority: 4, Token: Token { kind: ContainerToken, service: 10.216.45.30:43446 }, ExecutionType: GUARANTEED, ] matches request Request: [role: scheduler, index: 0, request: Capability[<memory:4096, vCores:4>]Priority[4]AllocationRequestId[0]ExecutionTypeRequest[{Execution Type: GUARANTEED, Enforce Execution Type: false}]Resource Profile[null], failtimes: FailoverTimes [failoverTimes=1], ]
2019-05-05 17:25:40,076 INFO xdl.AppMasterBase: response container size 1
2019-05-05 17:25:40,076 INFO xdl.AppMasterBase: Completed container container_1556600737538_0007_01_000003 finish state is COMPLETE exit status 1
2019-05-05 17:25:40,076 INFO xdl.AppMasterBase: container_1556600737538_0007_01_000003 ps container lost, lose exit status is 1, Launch it again
2019-05-05 17:25:40,076 INFO xdl.AppMasterBase: node ps:0 fail times FailoverTimes [failoverTimes=1]
2019-05-05 17:25:40,078 INFO xdl.AppMasterBase: 0 container has reallocated
2019-05-05 17:25:42,080 INFO xdl.AppMasterBase: 0 container has reallocated
2019-05-05 17:25:44,083 INFO xdl.AppMasterBase: 1 container has reallocated
2019-05-05 17:25:44,083 INFO xdl.AppMasterBase: response container Container: [ContainerId: container_1556600737538_0007_01_000006, AllocationRequestId: 0, Version: 0, NodeId: worker1:43446, NodeHttpAddress: worker1:8042, Resource: <memory:22755, vCores:4>, Priority: 4, Token: Token { kind: ContainerToken, service: 10.216.45.30:43446 }, ExecutionType: GUARANTEED, ]
2019-05-05 17:25:44,084 INFO xdl.AppMasterBase: response container Container: [ContainerId: container_1556600737538_0007_01_000006, AllocationRequestId: 0, Version: 0, NodeId: worker1:43446, NodeHttpAddress: worker1:8042, Resource: <memory:22755, vCores:4>, Priority: 4, Token: Token { kind: ContainerToken, service: 10.216.45.30:43446 }, ExecutionType: GUARANTEED, ] matches request Request: [role: ps, index: 0, request: Capability[<memory:775, vCores:4>]Priority[4]AllocationRequestId[0]ExecutionTypeRequest[{Execution Type: GUARANTEED, Enforce Execution Type: false}]Resource Profile[null], failtimes: FailoverTimes [failoverTimes=1], ]
2019-05-05 17:25:48,181 INFO xdl.AppMasterBase: container start command is $JAVA_HOME/bin/java -Xmx256M com.alibaba.xdl.ContainerRunner -c=config.tree_init.json -j=scheduler -i=0 -z=master:2181,worker1:2181 -r=/xdl -u=root -v=tdm_mock.tar.gz -cpuset=_CPULIST -cd=GPU_LIST_PLACEHOLDER 1> /stdout 2> /stderr
2019-05-05 17:25:48,181 INFO xdl.AppMasterBase: Launching scheduler container [container_1556600737538_0007_01_000005]
2019-05-05 17:25:48,190 INFO xdl.AppMasterBase: container Id is container_1556600737538_0007_01_000005,node Id is worker1:43446
2019-05-05 17:25:48,190 INFO xdl.AppMasterBase: failover launch node scheduler:0 sucess!
2019-05-05 17:25:48,190 INFO xdl.AppMasterBase: container start command is $JAVA_HOME/bin/java -Xmx256M com.alibaba.xdl.ContainerRunner -c=config.tree_init.json -j=ps -i=0 -z=master:2181,worker1:2181 -r=/xdl -u=root -v=tdm_mock.tar.gz -cpuset=_CPULIST -cd=GPU_LIST_PLACEHOLDER 1> /stdout 2> /stderr
2019-05-05 17:25:48,191 INFO xdl.AppMasterBase: Launching ps container [container_1556600737538_0007_01_000006]
2019-05-05 17:25:48,198 INFO xdl.AppMasterBase: container Id is container_1556600737538_0007_01_000006,node Id is worker1:43446
2019-05-05 17:25:48,199 INFO xdl.AppMasterBase: failover launch node ps:0 sucess!
2019-05-05 17:27:48,657 INFO xdl.AppMasterBase: response container size 1
2019-05-05 17:27:48,657 INFO xdl.AppMasterBase: Completed container container_1556600737538_0007_01_000006 finish state is COMPLETE exit status 1
2019-05-05 17:27:48,657 INFO xdl.AppMasterBase: container_1556600737538_0007_01_000006 ps container lost, lose exit status is 1, Launch it again
2019-05-05 17:27:48,657 INFO xdl.AppMasterBase: node ps:0 fail times FailoverTimes [failoverTimes=2]
2019-05-05 17:27:48,659 INFO xdl.AppMasterBase: 0 container has reallocated
2019-05-05 17:27:50,662 INFO xdl.AppMasterBase: 1 container has reallocated
2019-05-05 17:27:50,662 INFO xdl.AppMasterBase: response container Container: [ContainerId: container_1556600737538_0007_01_000008, AllocationRequestId: 0, Version: 0, NodeId: worker1:43446, NodeHttpAddress: worker1:8042, Resource: <memory:22755, vCores:4>, Priority: 5, Token: Token { kind: ContainerToken, service: 10.216.45.30:43446 }, ExecutionType: GUARANTEED, ]
2019-05-05 17:27:50,663 INFO xdl.AppMasterBase: response container Container: [ContainerId: container_1556600737538_0007_01_000008, AllocationRequestId: 0, Version: 0, NodeId: worker1:43446, NodeHttpAddress: worker1:8042, Resource: <memory:22755, vCores:4>, Priority: 5, Token: Token { kind: ContainerToken, service: 10.216.45.30:43446 }, ExecutionType: GUARANTEED, ] matches request Request: [role: ps, index: 0, request: Capability[<memory:775, vCores:4>]Priority[5]AllocationRequestId[0]ExecutionTypeRequest[{Execution Type: GUARANTEED, Enforce Execution Type: false}]Resource Profile[null], failtimes: FailoverTimes [failoverTimes=2], ]
2019-05-05 17:27:50,663 INFO xdl.AppMasterBase: response container size 1
2019-05-05 17:27:50,663 INFO xdl.AppMasterBase: Completed container container_1556600737538_0007_01_000005 finish state is COMPLETE exit status 1
2019-05-05 17:27:50,663 INFO xdl.AppMasterBase: container_1556600737538_0007_01_000005 scheduler container lost, lose exit status is 1, Launch it again
2019-05-05 17:27:50,663 INFO xdl.AppMasterBase: node scheduler:0 fail times FailoverTimes [failoverTimes=2]
2019-05-05 17:27:50,664 INFO xdl.AppMasterBase: 0 container has reallocated
2019-05-05 17:27:52,667 INFO xdl.AppMasterBase: 1 container has reallocated
2019-05-05 17:27:52,667 INFO xdl.AppMasterBase: response container Container: [ContainerId: container_1556600737538_0007_01_000009, AllocationRequestId: 0, Version: 0, NodeId: worker1:43446, NodeHttpAddress: worker1:8042, Resource: <memory:22755, vCores:4>, Priority: 5, Token: Token { kind: ContainerToken, service: 10.216.45.30:43446 }, ExecutionType: GUARANTEED, ]
2019-05-05 17:27:52,667 INFO xdl.AppMasterBase: response container Container: [ContainerId: container_1556600737538_0007_01_000009, AllocationRequestId: 0, Version: 0, NodeId: worker1:43446, NodeHttpAddress: worker1:8042, Resource: <memory:22755, vCores:4>, Priority: 5, Token: Token { kind: ContainerToken, service: 10.216.45.30:43446 }, ExecutionType: GUARANTEED, ] matches request Request: [role: scheduler, index: 0, request: Capability[<memory:4096, vCores:4>]Priority[5]AllocationRequestId[0]ExecutionTypeRequest[{Execution Type: GUARANTEED, Enforce Execution Type: false}]Resource Profile[null], failtimes: FailoverTimes [failoverTimes=2], ]
2019-05-05 17:27:57,101 INFO xdl.AppMasterBase: container has failed 4 times, shutdown this application
2019-05-05 17:27:57,102 ERROR xdl.AppMasterRunner: run error!
java.lang.RuntimeException: container has failed 4 times,shutdown this application
    at com.alibaba.xdl.AppMasterBase.processResponse(AppMasterBase.java:399)
    at com.alibaba.xdl.AppMasterBase.waitForWorkerFinish(AppMasterBase.java:303)
    at com.alibaba.xdl.AppMasterBase.run(AppMasterBase.java:182)
    at com.alibaba.xdl.AppMasterRunner.main(AppMasterRunner.java:81)
2019-05-05 17:27:57,110 INFO impl.AMRMClientImpl: Waiting for application to be successfully unregistered.
2019-05-05 17:27:57,259 INFO imps.CuratorFrameworkImpl: backgroundOperationsLoop exiting
2019-05-05 17:27:57,261 INFO zookeeper.ZooKeeper: Session: 0x65045cd626be0070 closed
2019-05-05 17:27:57,261 INFO zookeeper.ClientCnxn: EventThread shut down for session: 0x65045cd626be0070
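
For reference, the failover events in the AppMaster log above can be tallied with a short script. This is only a sketch: the regular expression is an assumption based on the wording of the `container lost` lines in this particular log, and `summarize_lost_containers` is a hypothetical helper, not part of XDL. On the four `container lost` entries above it counts two scheduler and two ps failures, which is consistent with the "container has failed 4 times" shutdown message. Note that this log only records exit status 1 for each container, not the reason behind it.

```python
import re
from collections import Counter

# Assumed pattern, derived from lines in the log above such as:
#   "container_..._000002 scheduler container lost, lose exit status is 1, Launch it again"
LOST_RE = re.compile(
    r"(container_\d+_\d+_\d+_\d+) (\w+) container lost, lose exit status is (-?\d+)"
)


def summarize_lost_containers(log_text):
    """Return every 'container lost' event and the number of failures per role."""
    events = [
        (container_id, role, int(status))
        for container_id, role, status in LOST_RE.findall(log_text)
    ]
    failures_per_role = Counter(role for _, role, _ in events)
    return events, failures_per_role


# The four 'container lost' entries copied from the AppMaster log above.
SAMPLE_LOG = """\
2019-05-05 17:25:38,068 INFO xdl.AppMasterBase: container_1556600737538_0007_01_000002 scheduler container lost, lose exit status is 1, Launch it again
2019-05-05 17:25:40,076 INFO xdl.AppMasterBase: container_1556600737538_0007_01_000003 ps container lost, lose exit status is 1, Launch it again
2019-05-05 17:27:48,657 INFO xdl.AppMasterBase: container_1556600737538_0007_01_000006 ps container lost, lose exit status is 1, Launch it again
2019-05-05 17:27:50,663 INFO xdl.AppMasterBase: container_1556600737538_0007_01_000005 scheduler container lost, lose exit status is 1, Launch it again
"""

events, failures_per_role = summarize_lost_containers(SAMPLE_LOG)
for container_id, role, status in events:
    print(f"{role:<10} {container_id} exit status {status}")
print(dict(failures_per_role))
```

The container IDs it lists (`..._000002`, `..._000003`, `..._000005`, `..._000006`) are the ones whose own stdout/stderr would need to be checked next.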