microsoft / AI-System

System for AI Education Resource.
https://microsoft.github.io/AI-System/
Creative Commons Attribution 4.0 International

openpai k8s error #26

Open fire-heart-li opened 3 years ago

fire-heart-li commented 3 years ago

Starting kubernetes... setup k8s cluster

PLAY [localhost] ***
[WARNING]: Could not match supplied host pattern, ignoring: bastion

PLAY [bastion[0]] **
skipping: no hosts matched

PLAY [k8s-cluster:etcd] ****
included: /home/openpai/pai-deploy/kubespray/roles/bootstrap-os/tasks/bootstrap-debian.yml for stu-276, iair279, stu-282

PLAY [k8s-cluster:etcd] ****

TASK [kubernetes/preinstall : Stop if access_ip is not pingable] ***
changed: [iair279]
changed: [stu-276]
changed: [stu-282]
included: /home/openpai/pai-deploy/kubespray/roles/container-engine/docker/tasks/set_facts_dns.yml for stu-276, iair279, stu-282
[WARNING]: flush_handlers task does not support when conditional

TASK [download : prep_download | Create staging directory on remote node] **
changed: [stu-276]
changed: [iair279]
changed: [stu-282]
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/prep_kubeadm_images.yml for stu-276
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/download_file.yml for stu-276

TASK [download : download_file | Create dest directory on node] ****
changed: [stu-276]
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/extract_file.yml for stu-276
[WARNING]: noop task does not support when conditional
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/download_container.yml for stu-276
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/download_file.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/download_file.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/download_file.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/download_file.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/download_container.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/download_container.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/download_container.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/download_container.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/download_container.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/download_container.yml for stu-276
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/download_container.yml for stu-276
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/download_container.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/download_container.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/download_container.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/download_container.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/download_container.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/download_container.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/set_docker_image_facts.yml for stu-276
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/check_pull_required.yml for stu-276

TASK [download : download_file | Create dest directory on node] ****
changed: [stu-282]
changed: [iair279]
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/extract_file.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/extract_file.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/extract_file.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/extract_file.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/set_docker_image_facts.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/check_pull_required.yml for stu-276, iair279, stu-282

TASK [download : download_container | Download image if required] **
changed: [stu-276 -> 192.168.1.187]
changed: [iair279 -> 192.168.1.187]
changed: [stu-282 -> 192.168.1.187]
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/set_docker_image_facts.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/check_pull_required.yml for stu-276, iair279, stu-282

TASK [download : download_container | Download image if required] **
changed: [stu-276 -> 192.168.1.187]
changed: [stu-282 -> 192.168.1.187]
changed: [iair279 -> 192.168.1.187]
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/set_docker_image_facts.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/check_pull_required.yml for stu-276, iair279, stu-282

TASK [download : download_container | Download image if required] **
changed: [stu-276 -> 192.168.1.187]
changed: [iair279 -> 192.168.1.187]
changed: [stu-282 -> 192.168.1.187]
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/set_docker_image_facts.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/check_pull_required.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/set_docker_image_facts.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/check_pull_required.yml for stu-276, iair279, stu-282

TASK [download : download_container | Download image if required] **
changed: [iair279 -> 192.168.1.187]
changed: [stu-276 -> 192.168.1.187]
changed: [stu-282 -> 192.168.1.187]
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/set_docker_image_facts.yml for stu-276
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/check_pull_required.yml for stu-276

TASK [download : download_container | Download image if required] **
changed: [stu-276 -> 192.168.1.187]
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/set_docker_image_facts.yml for stu-276
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/check_pull_required.yml for stu-276
FAILED - RETRYING: download_container | Download image if required (4 retries left).
FAILED - RETRYING: download_container | Download image if required (3 retries left).
FAILED - RETRYING: download_container | Download image if required (2 retries left).
FAILED - RETRYING: download_container | Download image if required (1 retries left).

TASK [download : download_container | Download image if required] **
fatal: [stu-276 -> 192.168.1.187]: FAILED! => {"attempts": 4, "changed": true, "cmd": ["/usr/bin/docker", "pull", "k8s.gcr.io/cluster-proportional-autoscaler-amd64:1.6.0"], "delta": "0:00:15.027078", "end": "2021-05-11 19:29:07.932611", "msg": "non-zero return code", "rc": 1, "start": "2021-05-11 19:28:52.905533", "stderr": "Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)", "stderr_lines": ["Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)"], "stdout": "", "stdout_lines": []}
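
For anyone triaging a log like this, the failing command and its error can be pulled out of the `fatal:` line mechanically. The following is only a minimal sketch in Python: `summarize_fatal` and `FATAL_LINE` are hypothetical names, the sample string is the fatal line above shortened to the fields used, and it assumes the text after `FAILED! =>` is valid JSON on a single line, as it is here.

```python
import json
import re

# Sample input: the fatal line from the log above, shortened to the fields used here.
FATAL_LINE = (
    'fatal: [stu-276 -> 192.168.1.187]: FAILED! => '
    '{"attempts": 4, "changed": true, '
    '"cmd": ["/usr/bin/docker", "pull", '
    '"k8s.gcr.io/cluster-proportional-autoscaler-amd64:1.6.0"], '
    '"rc": 1, '
    '"stderr": "Error response from daemon: Get https://k8s.gcr.io/v2/: '
    'net/http: request canceled while waiting for connection '
    '(Client.Timeout exceeded while awaiting headers)"}'
)

# Matches "fatal: [host]: FAILED! => {...}" or "fatal: [host -> delegate]: FAILED! => {...}".
FATAL_RE = re.compile(r"fatal: \[(?P<host>[^\]]+)\]: FAILED! => (?P<payload>\{.*\})")


def summarize_fatal(line):
    """Return host, delegate, command, return code and stderr, or None if no match."""
    match = FATAL_RE.search(line)
    if match is None:
        return None
    payload = json.loads(match.group("payload"))
    # "stu-276 -> 192.168.1.187" means the task ran on the delegate for that host.
    host, _, delegate = match.group("host").partition(" -> ")
    cmd = payload.get("cmd", [])
    if isinstance(cmd, list):
        cmd = " ".join(cmd)
    return {
        "host": host,
        "delegated_to": delegate or None,
        "command": cmd,
        "rc": payload.get("rc"),
        "stderr": payload.get("stderr", ""),
    }


if __name__ == "__main__":
    for key, value in summarize_fatal(FATAL_LINE).items():
        print(f"{key}: {value}")
```

Applied to this log, the summary shows that `docker pull` of the `k8s.gcr.io` image timed out on 192.168.1.187 while connecting to the registry, which is the only failure reported in the run.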

NO MORE HOSTS LEFT *****

PLAY RECAP *****
iair279 : ok=204 changed=7 unreachable=0 failed=0 skipped=192 rescued=0 ignored=0
localhost : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
stu-276 : ok=295 changed=8 unreachable=0 failed=1 skipped=258 rescued=0 ignored=0
stu-282 : ok=204 changed=7 unreachable=0 failed=0 skipped=192 rescued=0 ignored=0
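
The recap can also be checked by script rather than by eye. This is a rough sketch, not part of the OpenPAI or kubespray tooling: `parse_recap` and `failed_hosts` are hypothetical helpers, and the input is assumed to be the per-host recap lines exactly as printed above.

```python
import re

# The four per-host lines from the PLAY RECAP above.
RECAP = """\
iair279 : ok=204 changed=7 unreachable=0 failed=0 skipped=192 rescued=0 ignored=0
localhost : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
stu-276 : ok=295 changed=8 unreachable=0 failed=1 skipped=258 rescued=0 ignored=0
stu-282 : ok=204 changed=7 unreachable=0 failed=0 skipped=192 rescued=0 ignored=0
"""

# "<host> : key=value key=value ..."
LINE_RE = re.compile(r"^(?P<host>\S+)\s*:\s*(?P<counters>(?:\w+=\d+\s*)+)$")


def parse_recap(text):
    """Map each host to a dict of its integer counters; unrelated lines are skipped."""
    recap = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue
        pairs = (item.split("=") for item in match.group("counters").split())
        recap[match.group("host")] = {key: int(value) for key, value in pairs}
    return recap


def failed_hosts(recap):
    """Hosts with at least one failed task or reported as unreachable."""
    return sorted(
        host
        for host, counters in recap.items()
        if counters.get("failed", 0) > 0 or counters.get("unreachable", 0) > 0
    )


if __name__ == "__main__":
    print(failed_hosts(parse_recap(RECAP)))  # prints ['stu-276']
```

Only stu-276 is flagged, matching the single `fatal:` task in the run; the other hosts stopped without failures of their own.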