shikanon / kubeflow-manifests

One-click Kubeflow installation files for users in mainland China
GNU General Public License v3.0
338 stars 117 forks

Port 30000 cannot be reached even though everything installed successfully #48

Open Kang9779 opened 3 years ago

Kang9779 commented 3 years ago

Hi, today I ran into a situation where all pods are running normally but port 30000 cannot be reached. How should I troubleshoot this? (2 images)

shikanon commented 3 years ago

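@kangzhang0709 What error do you get when you try to reach port 30000? Port 30000 is the port through which istio is exposed on the cluster nodes. Judging by your service, it should be reachable at nodeIP:30000. If you don't know the node IP, you can also use kubectl port-forward to map port 80 directly to your local machine.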

kubectl -nistio-system port-forward svc/istio-ingressgateway 8000:80
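If the node IP is not known, one option is to ask the API server for each node's InternalIP and build the URLs from that. The following is a minimal sketch, assuming kubectl is already configured for the affected cluster; the helper name node_urls is hypothetical, and 30000 is simply the port discussed in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one candidate URL per cluster node.
# Assumes kubectl is on PATH and points at the affected cluster.
node_urls() {
  local port="${1:-30000}"   # port exposed on the nodes, per this thread
  kubectl get nodes \
    -o jsonpath='{range .items[*]}{.status.addresses[?(@.type=="InternalIP")].address}{"\n"}{end}' |
  while read -r ip; do
    # Skip empty lines (nodes without an InternalIP address).
    if [ -n "$ip" ]; then
      echo "http://${ip}:${port}"
    fi
  done
}
```

Calling node_urls prints one URL per node, which can then be passed to curl or opened in a browser.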

Kang9779 commented 3 years ago

:( I spent a long time troubleshooting and still can't tell what went wrong. All pods are in the Running state.

shikanon commented 3 years ago

@kangzhang0709 Can you show the error output?

curl -vvv -L <your-k8s-node-ip>:30000

ylylylylylyl commented 3 years ago

Same here: accessing port 30000 returns 403. While troubleshooting I found that the authservice-0 container reports the error below. Does anyone know what is going on?
time="2021-08-03T03:31:00Z" level=error msg="OIDC provider setup failed, retrying in 10 seconds: Get http://dex.auth.svc.cluster.local:5556/dex/.well-known/openid-configuration: dial tcp: lookup dex.auth.svc.cluster.local on 169.254.25.10:53: no such host"

shikanon commented 3 years ago

@ylylylylylyl That suggests dex was never installed. Run kubectl get svc -nauth dex to check whether it exists.
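The check suggested here can be wrapped in a small function so it is easy to repeat after a reinstall. This is only a sketch: the name check_dex is made up for illustration, and it assumes the dex Service is expected in the auth namespace, as the hostname dex.auth.svc.cluster.local in the log implies.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether the dex Service exists in namespace auth.
# Assumes kubectl is on PATH and configured for the affected cluster.
check_dex() {
  if kubectl get svc -n auth dex >/dev/null 2>&1; then
    echo "dex service found in namespace auth"
  else
    echo "dex service missing in namespace auth"
    return 1
  fi
}
```

A non-zero exit status means the Service lookup failed, which would match the "no such host" error reported by authservice.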

Kang9779 commented 3 years ago

* About to connect() to 10.12.1.12 port 30000 (#0)
*   Trying 10.12.1.12...
* Connected to 10.12.1.12 (10.12.1.12) port 30000 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.29.0
> Host: 10.12.1.12:30000
> Accept: */*
>
< HTTP/1.1 403 Forbidden
< date: Tue, 03 Aug 2021 03:11:30 GMT
< server: istio-envoy
< content-length: 0
<
* Connection #0 to host 10.12.1.12 left intact

Later I uninstalled everything and installed it again, and then it worked.
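To compare results before and after a reinstall, it can help to reduce the curl check to just the HTTP status code. The sketch below is not an official diagnostic: the function name explain_status is hypothetical, and the hints are only one reading of the symptoms reported in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map an HTTP status code from the gateway to a next step.
# The hints are assumptions drawn from this thread, not from any official guide.
explain_status() {
  case "$1" in
    000) echo "no HTTP response: check node IP, firewall, and NodePort" ;;
    200|302) echo "gateway reachable" ;;
    403) echo "gateway reachable but request rejected: check authservice and dex" ;;
    *) echo "unexpected status $1" ;;
  esac
}

# Usage (replace the placeholder with a real node IP):
#   explain_status "$(curl -s -o /dev/null -w '%{http_code}' http://<your-k8s-node-ip>:30000)"
```

curl reports 000 as the status code when no HTTP response was received at all, which is why that case is treated as a connectivity problem rather than an authentication one.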

tianya092 commented 2 years ago

I didn't see any errors during installation, but the services never fully come up. Many pods are stuck initializing and some are in Error:

NAME                                                        READY   STATUS              RESTARTS   AGE
admission-webhook-deployment-6fb9d65887-q55ls               1/1     Running             0          24m
cache-deployer-deployment-7558d65bf4-9blvv                  0/2     PodInitializing     0          24m
cache-server-67d98b4ddd-qlr7z                               0/2     Init:0/1            0          24m
centraldashboard-7b7676d8bd-22nws                           1/1     Running             0          24m
jupyter-web-app-deployment-66f74586d9-jcwlk                 1/1     Running             0          24m
katib-controller-77675c88df-4vcfz                           1/1     Running             0          24m
katib-db-manager-646695754f-889qq                           0/1     Running             6          24m
katib-mysql-5bb5bd9957-cb2xm                                1/1     Running             0          24m
katib-ui-55fd4bd6f9-l8882                                   1/1     Running             0          24m
kfserving-controller-manager-0                              0/2     ContainerCreating   0          22m
kubeflow-pipelines-profile-controller-5698bf57cf-wqgfq      1/1     Running             0          24m
metacontroller-0                                            1/1     Running             0          22m
metadata-envoy-deployment-76d65977f7-7nmdg                  1/1     Running             0          24m
metadata-grpc-deployment-697d9c6c67-vqdhs                   0/2     PodInitializing     0          24m
metadata-writer-58cdd57678-7gdfd                            0/2     PodInitializing     0          24m
minio-6d6784db95-p8rkx                                      0/2     PodInitializing     0          24m
ml-pipeline-85fc99f899-7pnt7                                0/2     PodInitializing     0          24m
ml-pipeline-persistenceagent-65cb9594c7-m5wfm               0/2     PodInitializing     0          24m
ml-pipeline-scheduledworkflow-7f8d8dfc69-hpwgw              0/2     PodInitializing     0          24m
ml-pipeline-ui-5c765cc7bd-w9tqv                             0/2     PodInitializing     0          24m
ml-pipeline-viewer-crd-5b8df7f458-c98pl                     0/2     PodInitializing     0          24m
ml-pipeline-visualizationserver-56c5ff68d5-gcnsd            0/2     PodInitializing     0          24m
mpi-operator-789f88879-stq6m                                0/1     Error               1          24m
mxnet-operator-7fff864957-lq6zl                             0/1     Error               0          24m
mysql-56b554ff66-559zn                                      0/2     PodInitializing     0          24m
notebook-controller-deployment-74d9584477-9mnqj             1/1     Running             0          24m
profiles-deployment-67b4666796-lwnzm                        0/2     ContainerCreating   0          24m
pytorch-operator-fd86f7694-9j5bs                            0/2     PodInitializing     0          24m
tensorboard-controller-controller-manager-fd6bcffb4-fg2mv   0/3     PodInitializing     0          23m
tensorboards-web-app-deployment-5465d687b9-v4n9m            1/1     Running             0          24m
tf-job-operator-7bc5cf4cc7-7p298                            0/1     CrashLoopBackOff    6          24m
volumes-web-app-deployment-88db758b8-pdd44                  1/1     Running             0          24m
workflow-controller-84dcfc89c-hlbmn                         2/2     Running             2          24m
xgboost-operator-deployment-5c7bfd57cc-x2rxv                0/2     PodInitializing     0          24m
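With a pod list this long, it may be easier to filter it down to the pods that are not healthy. A minimal sketch, assuming the standard five-column kubectl get pods output shown in this comment; the function name not_ready_pods is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `kubectl get pods` output on stdin and print the
# pods whose READY count is incomplete or whose STATUS is not Running.
# Note: pods of finished Jobs (STATUS Completed) are also printed by this filter.
not_ready_pods() {
  awk '$1 != "NAME" {
    split($2, ready, "/")            # "0/2" -> ready[1]=0, ready[2]=2
    if (ready[1] != ready[2] || $3 != "Running") print $1, $2, $3
  }'
}
```

It could be used as kubectl get pods -n kubeflow | not_ready_pods, where the kubeflow namespace is an assumption about where these pods live; kubectl describe pod on one of the printed names then shows the events explaining why it is stuck.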

Chenxs1122 commented 2 years ago

@kangzhang0709 Can you show the error output?

curl -vvv -L <your-k8s-node-ip>:30000

Log in to Your Account

tianya092 commented 2 years ago

Your message has been received. Thank you very much!

Chenxs1122 commented 2 years ago

Use telnet to check whether the port is open.
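Where telnet is not installed, a similar reachability check can be sketched with bash alone. The helper name port_open is hypothetical, and the sketch assumes bash (for its /dev/tcp pseudo-device) and the coreutils timeout command are available.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if a TCP connection to host:port can be opened.
# Relies on bash's /dev/tcp pseudo-device and gives up after 3 seconds.
port_open() {
  timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Usage, with the node IP that appears earlier in this thread:
#   port_open 10.12.1.12 30000 && echo open || echo closed
```

A successful connection only shows that something is listening on the port; a 403 from the gateway would still need to be investigated separately.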

hecheng64 commented 2 years ago

kubectl get svc -nauth dex

I reinstalled as well and it still doesn't work; it returns 403.

tianya092 commented 2 years ago

Your message has been received. Thank you very much!

xytsinghua commented 3 months ago

(3 images)

add env:

(2 images)

After that, port 30000 became accessible.

Then additionally run: kubectl apply -f patch/auth.yaml

That file contains the login username and password.

tianya092 commented 3 months ago

Your message has been received. Thank you very much!