FederatedAI / KubeFATE

Manage federated learning workload using cloud native technologies.
Apache License 2.0
424 stars 221 forks source link

1.10.0 k8s deployment hdfs error 255 #931

Open patricxu opened 9 months ago

patricxu commented 9 months ago

We have gotten a hdfs error on both clusters when try to read/write the hdfs. The system has two clusters, fate-10000 and fate 9998, on different PC, both of which are deployed with kube. The computing engin is spark. The log on 10000 微信图片_20240219101822 The settings on 10000 微信图片_20240219101913 微信图片_20240219101923 微信图片_20240219101932 微信图片_20240219102851 All the components run on 10000 微信图片_20240219101939 The logs on 9998 微信图片_20240219101901