apache-spark-on-k8s / kubernetes-HDFS

Repository holding configuration files for running an HDFS cluster in Kubernetes
Apache License 2.0

HDFS #97

Open NoIncomeTaxPlease opened 3 months ago

NoIncomeTaxPlease commented 3 months ago

Hi, I have a Kubernetes setup where we have deployed HDFS as a Helm chart with 3 JournalNodes, 3 DataNodes, 2 NameNodes, and the other HDFS3-HA components.

But when we run helm install, the NameNode pod restarts a few times with the error below and then comes up on its own. How can I get rid of this error?

I want zero restarts of the NameNode (NN) pod.

ERROR namenode.NameNode: Failed to start namenode java.net.SocketException: Call From hdfs-k8s to null:0 failed on socket exception: java.net.SocketException: Unresolved address; For more details see: http://wiki.apache.org/hadoop/SocketException

Just to add, the error above is only an excerpt from a much larger log. I have checked the official HDFS HA documentation and tried various configurations, a readinessProbe, and other options, but nothing has stopped these pod restarts.
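For context, the "Unresolved address" part of the error means that a hostname had not been resolved to an IP address when the call was made. Below is a minimal sketch, not a confirmed fix, of a check that waits until a hostname resolves before continuing. It assumes the restarts are related to DNS records not being ready yet, which the log excerpt alone does not prove. The function name `wait_for_dns`, its parameters, and the hostname in the usage comment are hypothetical and do not come from the chart.

```python
import socket
import time


def wait_for_dns(hostname, timeout_s=60.0, interval_s=2.0,
                 resolve=socket.gethostbyname, sleep=time.sleep,
                 clock=time.monotonic):
    """Return the resolved IP once `hostname` resolves, or None on timeout.

    `resolve`, `sleep`, and `clock` are injectable so the loop can be
    exercised without real DNS lookups or real waiting.
    """
    deadline = clock() + timeout_s
    while True:
        try:
            # socket.gaierror (a subclass of OSError) is raised while the
            # name is still unresolvable.
            return resolve(hostname)
        except OSError:
            if clock() >= deadline:
                return None
            sleep(interval_s)


# Hypothetical usage; the hostname is an assumption for illustration only:
# ip = wait_for_dns("my-namenode-0.my-namenode.default.svc.cluster.local")
# if ip is None:
#     raise SystemExit("hostname did not resolve in time")
```

A check like this could run before the NameNode process starts, so that the process is only launched once the peer hostnames resolve. Whether that removes the restarts in this setup is untested.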

Please help with any configuration or other changes.

Thanks.

NoIncomeTaxPlease commented 3 months ago

Any updates?