kontena / pharos-cluster

Pharos - The Kubernetes Distribution
https://k8spharos.dev/
Apache License 2.0

Error in step "Validate cluster version": SSL_connect returned=1 errno=0 state=SSLv3/TLS write client hello: wrong version number (OpenSSL::SSL::SSLError) #906

Closed: cmonty14 closed this issue 5 years ago

cmonty14 commented 5 years ago

What happened: After upgrading to 2.1.0-rc.2, I get this error in step "Validate cluster version":

I, [2018-12-10T01:08:16.603309 #30007]  INFO -- K8s::Transport: Using config with server=https://10.97.206.191:6443
    [vm191-kontena] got error (Excon::Error::Socket): SSL_connect returned=1 errno=0 state=SSLv3/TLS write client hello: wrong version number (OpenSSL::SSL::SSLError)

The debug trace file is available here.
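
For reference, this OpenSSL error generically means the TLS ClientHello was answered by something that does not speak TLS, such as a plain-HTTP listener. A minimal Ruby sketch that reproduces the same error on OpenSSL 1.1.x (standalone, not pharos code; the throwaway local server merely stands in for whatever plain-HTTP endpoint the connection actually reached):

    require 'socket'
    require 'openssl'

    # Throwaway plain-HTTP listener; it answers with HTTP bytes where the
    # TLS client expects a ServerHello.
    server = TCPServer.new('127.0.0.1', 0)
    Thread.new do
      client = server.accept
      client.write("HTTP/1.1 200 OK\r\n\r\n")
      client.close
    end

    sock = TCPSocket.new('127.0.0.1', server.addr[1])
    OpenSSL::SSL::SSLSocket.new(sock).connect
    # => OpenSSL::SSL::SSLError: SSL_connect returned=1 errno=0
    #    state=SSLv3/TLS write client hello: wrong version number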

What you expected to happen: Cluster deployment, setting up the master & clients using kubectl, should succeed.

How to reproduce it (as minimally and precisely as possible):

  1. chpharos use 2.1.0-rc.2
  2. pharos up -c cluster.yml

Anything else we need to know?: This issue was not occurring with pharos 2.0.4.

Environment:

cluster.yml:

hosts:
  - address: "10.97.206.191"
    user: thomas
    ssh_key_path: /home/thomas/.ssh/id_rsa.pub
    role: master
    environment:
      http_proxy: http://proxy:8080
      HTTP_proxy: http://proxy:8080
      HTTPS_PROXY: http://proxy:8080
      NO_PROXY: localhost,127.0.0.1,10.97.206.0/24,192.168.1.0/24,10.10.0.0/12,wdf.xyz.corp
    container_runtime: cri-o
  - address: "10.97.206.192"
    role: worker
    container_runtime: cri-o
  - address: "10.97.206.193"
    role: worker
    container_runtime: cri-o
  - address: "10.97.206.194"
    role: worker
    container_runtime: cri-o
network:
  dns_replicas: 2
  service_cidr: 10.10.0.0/12
  provider: weave
audit:
  file:
    path: /var/log/kube_audit/audit.json
    max_size: 100
    max_age: 30
    max_backups: 20
admission_plugins:
  - name: AlwaysPullImages
    enabled: true
  - name: LimitRanger
    enabled: false
addons:
  ingress-nginx:
    enabled: true
    node_selector:
      # only provision to nodes having the label "zone: dmz"
      zone: dmz
    configmap:
      # see all supported options: https://github.com/kubernetes/ingress-nginx/blob/master/docs/user-guide/configmap.md
      load-balance: least_conn
  host-upgrades:
    enabled: true
    interval: 7d

kke commented 5 years ago

Likely fixed by #884

cmonty14 commented 5 years ago

Hm... I didn't change any IP address of any node.

kke commented 5 years ago

The same fix probably gets past this one too: it turns SSL peer verification off for the version validation if the verified request fails.
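
A sketch of that fallback behaviour (illustrative only, not the actual change in #884): request the version endpoint with peer verification on, and retry with it off when the handshake fails.

    require 'excon'
    require 'openssl'

    # Hedged sketch, not pharos' real implementation: fetch_version and the
    # bare Excon calls are stand-ins for the version validation step.
    def fetch_version(server)
      Excon.get("#{server}/version", ssl_verify_peer: true).body
    rescue Excon::Error::Socket, OpenSSL::SSL::SSLError
      Excon.get("#{server}/version", ssl_verify_peer: false).body
    end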

kke commented 5 years ago

Or you can stop the kube API server on the master before running pharos up; that will make it skip the version validation.

kke commented 5 years ago

Are you using a proxy locally (or running pharos up on the master host)? The HTTP library used by pharos does not understand CIDRs in NO_PROXY.
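
For illustration, NO_PROXY handling in common HTTP clients is plain string or domain-suffix matching, so a CIDR entry such as 10.97.206.0/24 never matches a concrete address like 10.97.206.191. A sketch of the typical matching logic (an assumption about the general pattern, not pharos' exact code):

    # Typical NO_PROXY check: exact-host or domain-suffix comparison only.
    no_proxy = 'localhost,127.0.0.1,10.97.206.0/24,wdf.xyz.corp'.split(',')
    host     = '10.97.206.191'

    bypass = no_proxy.any? { |entry| host == entry || host.end_with?(".#{entry}") }
    # bypass is false: "10.97.206.0/24" is compared as a literal string, so
    # the request to the API server still goes through http://proxy:8080.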

cmonty14 commented 5 years ago

I'm running a proxy in the LAN that cannot be disabled.

cmonty14 commented 5 years ago

After stopping the kube-apiserver process before executing pharos up, this issue is solved. Unfortunately, there's a new issue with the same error message, but in phase "Configure kube client @ vm191-kontena" (note: vm191-kontena = master).

Should I open another issue for this?

cmonty14 commented 5 years ago

This specific issue is solved.