apache-spark-on-k8s / kubernetes-HDFS

Repository holding configuration files for running an HDFS cluster in Kubernetes
Apache License 2.0

NameNode HA: HDFS access gives error "Operation category READ is not supported in state standby" #54

Open juv opened 6 years ago

juv commented 6 years ago

I've set up NameNode HA. The problem is that my Kubernetes service routes to either the active or the standby namenode. For example, I have a Spark History Server running with the parameter SPARK_HISTORY_OPTS="-Dspark.history.fs.logDirectory=hdfs://hadoop-hdfs-nn.my-namespace.svc:9000/shared/spark-logs", and it fails with the error "Operation category READ is not supported in state standby":

Exception in thread "main" java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.spark.deploy.history.HistoryServer$.main(HistoryServer.scala:278)
    at org.apache.spark.deploy.history.HistoryServer.main(HistoryServer.scala)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby
    at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:87)
    at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1779)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1313)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setSafeMode(NameNodeRpcServer.java:1063)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setSafeMode(ClientNamenodeProtocolServerSideTranslatorPB.java:739)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)

    at org.apache.hadoop.ipc.Client.call(Client.java:1475)
    at org.apache.hadoop.ipc.Client.call(Client.java:1412)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
    at com.sun.proxy.$Proxy10.setSafeMode(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.setSafeMode(ClientNamenodeProtocolTranslatorPB.java:666)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy11.setSafeMode(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.setSafeMode(DFSClient.java:2596)
    at org.apache.hadoop.hdfs.DistributedFileSystem.setSafeMode(DistributedFileSystem.java:1223)
    at org.apache.spark.deploy.history.FsHistoryProvider.isFsInSafeMode(FsHistoryProvider.scala:673)
    at org.apache.spark.deploy.history.FsHistoryProvider.isFsInSafeMode(FsHistoryProvider.scala:666)
    at org.apache.spark.deploy.history.FsHistoryProvider.initialize(FsHistoryProvider.scala:159)
    at org.apache.spark.deploy.history.FsHistoryProvider.<init>(FsHistoryProvider.scala:156)
    at org.apache.spark.deploy.history.FsHistoryProvider.<init>(FsHistoryProvider.scala:78)
... 6 more

My guess is that the Spark History Server accessed the k8s service and got routed to the standby namenode instead of the active one. Have you thought about this problem, @kimoonkim? I have an idea for solving it:

  1. add another service hadoop-hdfs-active-nn
  2. make it select only the active namenode pod
  3. run another lightweight pod that keeps track of which namenode is active
  4. update the service's label selector to point at the new active namenode whenever it switches

Then, let all clients connect to the hadoop-hdfs-active-nn service instead of hadoop-hdfs-nn. I am not sure whether it is possible to update a service's label selector from within a pod, though...
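For reference, a rough sketch of what the switchover could look like with kubectl (the hadoop-hdfs-active-nn service name, the nn0/nn1 service ids, and the pod names are assumptions, not something the charts define; a pod with suitable RBAC could do the same through the Kubernetes API):

# Find out which namenode currently holds the active role.
hdfs haadmin -getServiceState nn0
hdfs haadmin -getServiceState nn1

# Repoint the dedicated service at the active pod by patching its selector.
# statefulset.kubernetes.io/pod-name is set automatically on StatefulSet pods.
kubectl patch service hadoop-hdfs-active-nn --type merge \
  -p '{"spec":{"selector":{"statefulset.kubernetes.io/pod-name":"hdfs-namenode-0"}}}'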

vvbogdanov87 commented 6 years ago

You should configure the client to work with HDFS in HA mode manually. You can get the params from hdfs-site.xml (kubectl describe configmap hdfs-config). Some apps can read the params from hdfs-site.xml directly; just put it in /etc/hadoop/conf. Other apps can be configured using flags. I use the following syntax to launch Spark (2.3.1) jobs on k8s:

spark-submit ... \
    --conf spark.hadoop.fs.defaultFS="hdfs://hdfs-k8s" \
    --conf spark.hadoop.fs.default.name="hdfs://hdfs-k8s" \
    --conf spark.hadoop.dfs.nameservices="hdfs-k8s" \
    --conf spark.hadoop.dfs.ha.namenodes.hdfs-k8s="nn0,nn1" \
    --conf spark.hadoop.dfs.namenode.rpc-address.hdfs-k8s.nn0="hdfs-namenode-0.hdfs-namenode.default:8020" \
    --conf spark.hadoop.dfs.namenode.rpc-address.hdfs-k8s.nn1="hdfs-namenode-1.hdfs-namenode.default:8020" \
    --conf spark.hadoop.dfs.client.failover.proxy.provider.hdfs-k8s="org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider" \

Homegrown apps can be configured via code changes https://stackoverflow.com/a/35911455
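If you are not sure which values to plug into those flags, the same properties can be read back from the chart's configmap (hdfs-config, as mentioned above). A quick way to grep them:

# Pull the HA client settings that the spark-submit flags above mirror
# (configmap name hdfs-config, as used in this comment; adjust if yours differs).
kubectl get configmap hdfs-config -o yaml | \
  grep -A1 -E 'dfs.nameservices|dfs.ha.namenodes|dfs.namenode.rpc-address'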

maver1ck commented 6 years ago

I did the same by mounting secrets with core-site.xml and hdfs-site.xml into the Spark app. The options you gave didn't work.

xianliangz commented 5 years ago

We have a similar idea to @juv's and I am implementing it. Basically, we create a ZooKeeper watcher that gets notified of changes to the znode /hadoop-ha/ns/ActiveStandbyElectorLock, which tells us which NameNode is active. We then label the active NameNode pod so that the k8s service only routes requests to the active NameNode. This way, clients don't need to retry to figure out which one is active. I would like to get feedback on this solution from experts. Thank you!
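A rough shell approximation of the relabeling step (this polls hdfs haadmin instead of using a real ZooKeeper watch on the znode; the pod names, the hdfs-nn-state label key, and the nn0/nn1 service ids are assumptions):

# Keep a label in sync with the HA state so that a Service selecting
# hdfs-nn-state=active always routes to the active NameNode pod.
while true; do
  for i in 0 1; do
    state=$(hdfs haadmin -getServiceState nn$i)
    kubectl label pod hdfs-namenode-$i hdfs-nn-state="$state" --overwrite
  done
  sleep 10
done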

dennischin commented 5 years ago

@maver1ck were you able to get this working with configmaps? I am trying the configmap approach as well and failing.

My HDFS HA cluster name is not resolvable even though core-site.xml and hdfs-site.xml are properly mounted from the configmaps.

Caused by: java.lang.IllegalArgumentException: java.net.UnknownHostException: hdfs-k8s
    at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:378)
    at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:310)
    at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:176)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:678)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:619)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:149)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2669)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
    at org.apache.spark.util.Utils$.getHadoopFileSystem(Utils.scala:1866)
    at org.apache.spark.util.Utils$.doFetchFile(Utils.scala:721)
    at org.apache.spark.util.Utils$.fetchFile(Utils.scala:496)
    at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$5.apply(Executor.scala:811)
    at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$5.apply(Executor.scala:803)
    at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
    at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:130)
    at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:130)
    at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:236)
    at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
    at scala.collection.mutable.HashMap.foreach(HashMap.scala:130)
    at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
    at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$updateDependencies(Executor.scala:803)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:375)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.UnknownHostException: hdfs-k8s
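One thing I notice is that the failure comes from an executor (createNonHAProxy in the trace), so maybe the HA settings reach the driver but not the executor pods. I'm checking with something like this (the paths are guesses based on how my configmaps are mounted):

kubectl exec -ti <executor-pod> -- printenv HADOOP_CONF_DIR
kubectl exec -ti <executor-pod> -- ls /etc/hadoop/conf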

vvbogdanov87 commented 5 years ago

@dennischin Can you connect to the hdfs-client pod and browse files on HDFS?

kubectl get pod -l=app=hdfs-client
kubectl exec -ti <podname> bash
hadoop fs -ls /
hadoop fs -put /local/file /

You can find the configuration on the hdfs-client pod in /etc/hadoop-custom-conf/.

dennischin commented 5 years ago

@vvbogdanov87, yes, I can interact with HDFS (outside of Spark) without issue.

geekyouth commented 4 years ago

vim spark-env.sh

export SPARK_HISTORY_OPTS="-Dspark.history.ui.port=18080 -Dspark.history.retainedApplications=30 -Dspark.history.fs.logDirectory=hdfs://mycluster/spark-job-log"
export YARN_CONF_DIR=/opt/hadoop-2.7.2-ha/etc/hadoop
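This works because hdfs://mycluster is the logical HA nameservice, so the client-side failover proxy configured in hdfs-site.xml picks whichever NameNode is currently active instead of a fixed host. A minimal sketch of the same idea for the k8s setup discussed above (the hdfs-k8s nameservice and the conf path are assumptions; adjust to your deployment):

# HADOOP_CONF_DIR must contain the HA core-site.xml/hdfs-site.xml; Spark's launcher
# adds it to the classpath, so the history server can resolve the nameservice.
export HADOOP_CONF_DIR=/etc/hadoop/conf
export SPARK_HISTORY_OPTS="-Dspark.history.fs.logDirectory=hdfs://hdfs-k8s/shared/spark-logs"
$SPARK_HOME/sbin/start-history-server.sh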