Angel-ML / angel

A Flexible and Powerful Parameter Server for large-scale machine learning

Authentication fails with SONA #801

Open zxsimple opened 5 years ago

zxsimple commented 5 years ago

@ouyangwen-it

How are the Kerberos properties angel.kerberos.keytab and angel.kerberos.principal used in SONA?

ouyangwen-it commented 5 years ago

You should use the spark-submit options --principal and --keytab when using SONA.
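A minimal sketch of such a submission, assuming the SONA LR example class that appears in the stack traces in this thread. The function name `submit_sona_lr` and the variables `SPARK_SUBMIT`, `DEPLOY_MODE`, and `SONA_JAR` are hypothetical and only serve the illustration; the jar location and any arguments the LR example expects are not given in the thread, so they are left to the caller.

```shell
#!/usr/bin/env bash
# Sketch only: wrap spark-submit so the Kerberos principal and keytab are always passed.
# SPARK_SUBMIT, DEPLOY_MODE and SONA_JAR are assumed names, not Spark or Angel settings.
submit_sona_lr() {
  local principal="$1" keytab="$2"
  shift 2
  # --principal and --keytab are standard spark-submit options for YARN.
  "${SPARK_SUBMIT:-spark-submit}" \
    --master yarn \
    --deploy-mode "${DEPLOY_MODE:-client}" \
    --principal "$principal" \
    --keytab "$keytab" \
    --class com.tencent.angel.spark.examples.basic.LR \
    "${SONA_JAR:?set SONA_JAR to the SONA examples jar}" "$@"
}
```

Calling it with `SPARK_SUBMIT=echo` prints the assembled command instead of running it, which is a cheap way to confirm both options are actually on the command line before submitting.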

zxsimple commented 5 years ago

Whether I specify the --principal and --keytab options or the spark.yarn.keytab and spark.yarn.principal configuration, I get a Connection Refused exception. Please note that the kinit command works fine.

Exception while invoking getNewApplication of class ApplicationClientProtocolPBClientImpl over 141 after 1 fail over attempts. Trying to fail over after sleeping for 30266ms. | org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:146)
java.net.ConnectException: Call From host-xxx/xxx to host-yyy:26004 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:815)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:746)
    at org.apache.hadoop.ipc.Client.call(Client.java:1528)
    at org.apache.hadoop.ipc.Client.call(Client.java:1460)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
    at com.sun.proxy.$Proxy27.getNewApplication(Unknown Source)
    at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getNewApplication(ApplicationClientProtocolPBClientImpl.java:231)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:202)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
    at com.sun.proxy.$Proxy28.getNewApplication(Unknown Source)
    at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getNewApplication(YarnClientImpl.java:227)
    at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.createApplication(YarnClientImpl.java:235)
    at com.tencent.angel.client.yarn.AngelYarnClient.startPSServer(AngelYarnClient.java:140)
    at com.tencent.angel.client.AngelPSClient.startPS(AngelPSClient.java:115)
    at com.tencent.angel.spark.context.AngelPSContext$.launchAngel(AngelPSContext.scala:301)
    at com.tencent.angel.spark.context.AngelPSContext$.apply(AngelPSContext.scala:265)
    at com.tencent.angel.spark.context.PSContext$.liftedTree1$1(PSContext.scala:85)
    at com.tencent.angel.spark.context.PSContext$.instance(PSContext.scala:83)
    at com.tencent.angel.spark.context.PSContext$.getOrCreate(PSContext.scala:67)
    at com.tencent.angel.spark.examples.basic.LR$.main(LR.scala:43)
    at com.tencent.angel.spark.examples.basic.LR.main(LR.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:650)
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:658)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:763)
    at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:394)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1577)
    at org.apache.hadoop.ipc.Client.call(Client.java:1499)
    ... 27 more

ouyangwen-it commented 5 years ago

Please paste more detailed logs. Can you submit a Spark example job without Angel?

zxsimple commented 5 years ago

I can submit the Spark example with or without Kerberos authentication.

spark-submit --class org.apache.spark.examples.SparkPi \
    --master yarn-client \
    --keytab [my_keytab] \
    --principal [my_name] \
    --num-executors 4 \
    --driver-memory 512m \
    --executor-memory 512m \
    --executor-cores 1 \
    $SPARK_HOME/examples/jars/spark-examples_2.11-2.1.0.jar 10

The exception below indicates that authentication fails with the keytab and user. The same exception is repeated after several seconds.

2019-07-02 20:01:18,221 | INFO  | [dispatcher-event-loop-14] | Registered executor NettyRpcEndpointRef(null) (xxxxxx:52720) with ID 10 | org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)
2019-07-02 20:01:18,272 | INFO  | [dispatcher-event-loop-2] | Registering block manager host-xxxxx:22744 with 2004.6 MB RAM, BlockManagerId(10,xxxxx, None) | org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)
2019-07-02 20:01:18,498 | INFO  | [dispatcher-event-loop-13] | Registered executor NettyRpcEndpointRef(null) (xxxxxx:24084) with ID 8 | org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)
2019-07-02 20:01:18,539 | INFO  | [dispatcher-event-loop-10] | Registering block manager host-xxxxx:22614 with 2004.6 MB RAM, BlockManagerId(8, xxxxxx, 22614, None) | org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)
2019-07-02 20:01:41,370 | INFO  | [Driver] | Failing over to 140 | org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider.performFailover(ConfiguredRMFailoverProxyProvider.java:100)
2019-07-02 20:01:41,372 | WARN  | [Driver] | Exception encountered while connecting to the server : org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS] | org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:726)
2019-07-02 20:01:41,372 | INFO  | [Driver] | Exception while invoking getNewApplication of class ApplicationClientProtocolPBClientImpl over 140 after 2 fail over attempts. Trying to fail over after sleeping for 39296ms. | org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:146)
java.io.IOException: Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]; Host Details : local host is: "xxxx/xxxxx"; destination host is: "host-xxxxx":26004; 
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:796)
    at org.apache.hadoop.ipc.Client.call(Client.java:1528)
    at org.apache.hadoop.ipc.Client.call(Client.java:1460)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
    at com.sun.proxy.$Proxy21.getNewApplication(Unknown Source)
    at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getNewApplication(ApplicationClientProtocolPBClientImpl.java:231)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:202)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
    at com.sun.proxy.$Proxy22.getNewApplication(Unknown Source)
    at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getNewApplication(YarnClientImpl.java:227)
    at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.createApplication(YarnClientImpl.java:235)
    at com.tencent.angel.client.yarn.AngelYarnClient.startPSServer(AngelYarnClient.java:130)
    at com.tencent.angel.client.AngelPSClient.startPS(AngelPSClient.java:115)
    at com.tencent.angel.spark.context.AngelPSContext$.launchAngel(AngelPSContext.scala:301)
    at com.tencent.angel.spark.context.AngelPSContext$.apply(AngelPSContext.scala:265)
    at com.tencent.angel.spark.context.PSContext$.liftedTree1$1(PSContext.scala:85)
    at com.tencent.angel.spark.context.PSContext$.instance(PSContext.scala:83)
    at com.tencent.angel.spark.context.PSContext$.getOrCreate(PSContext.scala:67)
    at com.tencent.angel.spark.examples.basic.LR$.main(LR.scala:43)
    at com.tencent.angel.spark.examples.basic.LR.main(LR.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:650)
Caused by: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
    at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:731)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1778)
    at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:694)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:784)
    at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:394)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1577)
    at org.apache.hadoop.ipc.Client.call(Client.java:1499)
    ... 27 more
Caused by: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
    at org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:177)
    at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:404)
    at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:581)
    at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:394)
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:776)
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:772)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1778)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:771)
    ... 30 more
2019-07-02 20:02:20,669 | INFO  | [Driver] | Failing over to 141 | org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider.performFailover(ConfiguredRMFailoverProxyProvider.java:100)
2019-07-02 20:02:20,671 | INFO  | [Driver] | Exception while invoking getNewApplication of class ApplicationClientProtocolPBClientImpl over 141 after 3 fail over attempts. Trying to fail over after sleeping for 39585ms. | org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:146)
java.net.ConnectException: Call From host-xxxx/xxxxx toxxxx:26004 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:815)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:746)
    at org.apache.hadoop.ipc.Client.call(Client.java:1528)
    at org.apache.hadoop.ipc.Client.call(Client.java:1460)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
    at com.sun.proxy.$Proxy21.getNewApplication(Unknown Source)
    at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getNewApplication(ApplicationClientProtocolPBClientImpl.java:231)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:202)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
    at com.sun.proxy.$Proxy22.getNewApplication(Unknown Source)
    at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getNewApplication(YarnClientImpl.java:227)
    at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.createApplication(YarnClientImpl.java:235)
    at com.tencent.angel.client.yarn.AngelYarnClient.startPSServer(AngelYarnClient.java:130)
    at com.tencent.angel.client.AngelPSClient.startPS(AngelPSClient.java:115)
    at com.tencent.angel.spark.context.AngelPSContext$.launchAngel(AngelPSContext.scala:301)
    at com.tencent.angel.spark.context.AngelPSContext$.apply(AngelPSContext.scala:265)
    at com.tencent.angel.spark.context.PSContext$.liftedTree1$1(PSContext.scala:85)
    at com.tencent.angel.spark.context.PSContext$.instance(PSContext.scala:83)
    at com.tencent.angel.spark.context.PSContext$.getOrCreate(PSContext.scala:67)
    at com.tencent.angel.spark.examples.basic.LR$.main(LR.scala:43)
    at com.tencent.angel.spark.examples.basic.LR.main(LR.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:650)
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:658)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:763)
    at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:394)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1577)
    at org.apache.hadoop.ipc.Client.call(Client.java:1499)
    ... 27 more
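
The two failover targets in the log above fail in different ways: the attempt against 140 ends in `Client cannot authenticate via:[TOKEN, KERBEROS]`, while the attempt against 141 ends in `Connection refused`. A minimal sketch for tallying this per ResourceManager id from a saved driver log, assuming the log format shown above; `summarize_rm_failures` and the category labels are hypothetical names.

```shell
#!/usr/bin/env bash
# Sketch only: count failure kinds per ResourceManager id in a saved driver log.
# Assumes each "Exception while invoking ... over <id> after ..." line is followed
# by the exception headline, as in the log pasted in this thread.
summarize_rm_failures() {
  awk '
    /Exception while invoking .* over [0-9]+ after/ {
      # The first "over" followed by a number names the ResourceManager id.
      for (i = 1; i < NF; i++) {
        if ($i == "over" && $(i + 1) ~ /^[0-9]+$/) { rm = $(i + 1); break }
      }
      next
    }
    rm != "" && /^java\./ {
      # Classify the exception headline that follows the retry message.
      if ($0 ~ /Connection refused/) kind = "connection-refused"
      else if ($0 ~ /Client cannot authenticate/) kind = "auth-failure"
      else kind = "other"
      count[rm " " kind]++
      rm = ""
    }
    END { for (k in count) print k, count[k] }
  ' "$1" | sort
}
```

Usage would be `summarize_rm_failures driver.log`, where `driver.log` is a placeholder file name. A refused connection on only one of the two ids can be what a standby ResourceManager looks like in an HA pair, though the thread does not confirm this. The `[Driver]` thread name and the `ApplicationMaster` frame at the bottom of each trace also suggest the failing call was made from inside the YARN ApplicationMaster rather than from the submitting client.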
zxsimple commented 5 years ago

Did you specify the keytab file as a client-local file, or did you submit it with the --files option?

ouyangwen-it commented 5 years ago

The keytab file is a client-local file (local path). Where is your keytab file?

zxsimple commented 5 years ago

Yes, it is a local file.
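
A small sketch of a pre-submit check for that requirement, assuming the keytab is expected to be a readable file on the submitting machine. `require_local_keytab` is a hypothetical helper name, not part of Spark or Angel.

```shell
#!/usr/bin/env bash
# Sketch only: fail early unless the keytab is a readable file on the local filesystem.
require_local_keytab() {
  local keytab="$1"
  case "$keytab" in
    # Reject URIs such as hdfs://..., since a local path is expected here.
    *://*) echo "keytab must be a local path, got URI: $keytab" >&2; return 1 ;;
  esac
  if [ ! -f "$keytab" ] || [ ! -r "$keytab" ]; then
    echo "keytab not found or not readable: $keytab" >&2
    return 1
  fi
}
```

It could be called right before spark-submit, for example `require_local_keytab "$KEYTAB" || exit 1`. This only checks the path; whether the keytab contains the expected principal can be inspected separately with `klist -kt`.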

ouyangwen-it commented 5 years ago

Does this mean that submitting the Spark example failed? Is your cluster running normally?