ReSearchITEng / kubeadm-playbook

Fully fledged (HA) Kubernetes Cluster using official kubeadm, ansible and helm. Tested on RHEL/CentOS/Ubuntu with support of http_proxy, dashboard installed, ingress controller, heapster - using official helm charts
https://researchiteng.github.io/kubeadm-playbook/
The Unlicense
592 stars 102 forks

admin conf not getting generated correctly #81

Closed gallexme closed 4 years ago

gallexme commented 5 years ago

I used an FQDN in the inventory file. This is the result:

apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUN5RENDQWJDZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRFNU1URXdOakU0TlRJMU5Wb1hEVEk1TVRFd016RTROVEkxTlZvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBTnpQCkJJbnZiSjkxRmVRTHpWQWFpTnVHNmxkVTUraU5NQTV1Z0JlZDNibWVNakZoaVUvdFRzUXozQ2xkZGxIQjBiSHoKYUJJdGo2N2Q0TDNuaXlOQUJOMGMvTnRqZ05QR2Z1UktHMWlsaE1nSHN4UjJSaEFoSFJub1N1cFRJUTdSL1NwUQpvS3NKTWxMTWZwS1hUTmo2VVhOQ2JHR2F1bERKZG5VTXhjZm4yMU4rYy9MeEFtcWhadkhSb21ybjE2SjBFWU1hCkVQNVZ3aldhWlRxaTMyR2lGVG11SlNrVEtNWGEyT01RV1dYYmh4eXVGRFlQdXVtWFIyKzhxQlZFSFFzQ1dSQ1oKWXlQQVhIMlhXUVNjN1RjcFFoS1oxbldXbWF3V0d6emVmc3liamVoTk5IL2toUGpjd3VPU29oR1QrYzVTNUNVUQp6RnFwL2poVE5xQVdhaFVPaGxVQ0F3RUFBYU1qTUNFd0RnWURWUjBQQVFIL0JBUURBZ0trTUE4R0ExVWRFd0VCCi93UUZNQU1CQWY4d0RRWUpLb1pJaHZjTkFRRUxCUUFEZ2dFQkFCQmZzVDhFMllXZENjMGVTS1luOGRyOWdDNloKdmxoVjMxUDNISno0MXhCNEp4QXBpWllFZXFKMTQ5S3dCSWFyMGU5Y2NSeUdKM21Benp6TnhVS3ozSGliQlBRYgpzYXd1NXpKRUpLSHJYSVdRMStlN1FEMU83TXhZNVVzbGMvbjNnVlBQQ3gyTUQ0TjJQeUJRRERydFdRaEVCcWFkCmcyL015V1BVdnpodkZqSXBVYjJZdHRvYisvVFJVb3U2Uk0zK2VsU1dnVGNyR2FkS2s3dkZkb1k1YUNuUlhNNE8KSXZIa0xIQ0Z2c1VFNk9zcldFTzY5TFpPdXFiSVd5SjJiVWxKV0tKNWp5Y3VHZ05kZGtuT1hQUTJkTFFoMGJaTAp5aVVHbkpsL2lyR3RlYVl2K0l5K0p4dU9RZkFmZE5xZkcyVFhjaWs5OUNhR0lBWXBiWjByRFJjU0Z1UT0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=
    ²-210-136-24.rev.poneytelecom.eu:6443
  name: node0
contexts:
- context:
    cluster: node0
    user: kubernetes-admin
  name: kubernetes-admin@node0
current-context: kubernetes-admin@node0
kind: Config
preferences: {}
users:
- name: kubernetes-admin
  user:
    client-certificate-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUM4akNDQWRxZ0F3SUJBZ0lJWUFnWFhUbktodlF3RFFZSktvWklodmNOQVFFTEJRQXdGVEVUTUJFR0ExVUUKQXhNS2EzVmlaWEp1WlhSbGN6QWVGdzB4T1RFeE1EWXhPRFV5TlRWYUZ3MHlNREV4TURVeE9EVXlOVGhhTURReApGekFWQmdOVkJBb1REbk41YzNSbGJUcHRZWE4wWlhKek1Sa3dGd1lEVlFRREV4QnJkV0psY201bGRHVnpMV0ZrCmJXbHVNSUlCSWpBTkJna3Foa2lHOXcwQkFRRUZBQU9DQVE4QU1JSUJDZ0tDQVFFQTAxU21CMmZnRVg0bnFYWEwKYnlEU1ZsWEtoZmg4WVBuN1h3MElrcjgwbHpRekhBQmczSUo4N1A0MFBZTUNVNHZxRE50Vm96TFVvMmxndUt1KwpFNnIySFp5bmQ4VFVHWVlFb2w3QjYrMEdtM2grRnRFZDRDTHo4R3BBQ2tmYmY4VENEczF5SERRa21BekdPNkI5CnJ5ZE11ZGdGVFd6U2VONUJxSmgwbGlkdG1SWEoxRUF1eEFMTFltc1FUZVV1MzBobVJabEdkbkpFenE1d3l4ZUoKQTE1bGZGdkpFUm5mWjZ1RExEaEhoVUZ2MWEvaExKOE9OV1pGeFdwSTUzRVlONGtNaHlCb0VxS3RDMG1FTmNOcApMdU9TL1pzQXU1aWlUOGRLSEJZRVZYcy9maUJ2ems5b3pTbG9TMmRhK1dyKzBaMUlmSXliY2dOY1pNSnArcm91CnhtVzFWUUlEQVFBQm95Y3dKVEFPQmdOVkhROEJBZjhFQkFNQ0JhQXdFd1lEVlIwbEJBd3dDZ1lJS3dZQkJRVUgKQXdJd0RRWUpLb1pJaHZjTkFRRUxCUUFEZ2dFQkFJa2NKMFZDaE8wYnEveXRlTVZuK01JZUtoS25seFR6aEh2cApQc2dYRVlRdHZUcTk0ZFJ4M1F6L1hFYlNHRTVHVWQwcGJFOVRSSGREcXVGV0VYQkFYNUwrR25oS0hQNE4yU2pTCnBWUkpwSE9uU045QzVoMEFRbGFXZFI3MW82T20yY21pUXdtMUxkRWY3c0RiMWV4MWpoTWo1UFlUWWZsd21XVGMKM2Nza0NTWVl5T0VBNU93Q0tNZGhkMHRQbHZPMlQ2a3hmdldmeU1TUkdSMVlQdVRBU0xKQkpVRm42VUZGRWo4bAo4MkdDbWs1ZEIrMUZaQXA2RmczUFlmL25WOXhHNThYU1VjM0kvVmFBaVRaaStTUWVLb3luclJyeTl3eUFDSEtzCjhEbmNuZWhNV2gyZldGYmJZUFluUk0rZE9TSFFqQnRuTU5UVmJ2WkNBWDFCVktUMUNlMD0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=
    client-key-data: LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFcEFJQkFBS0NBUUVBMDFTbUIyZmdFWDRucVhYTGJ5RFNWbFhLaGZoOFlQbjdYdzBJa3I4MGx6UXpIQUJnCjNJSjg3UDQwUFlNQ1U0dnFETnRWb3pMVW8ybGd1S3UrRTZyMkhaeW5kOFRVR1lZRW9sN0I2KzBHbTNoK0Z0RWQKNENMejhHcEFDa2ZiZjhUQ0RzMXlIRFFrbUF6R082QjlyeWRNdWRnRlRXelNlTjVCcUpoMGxpZHRtUlhKMUVBdQp4QUxMWW1zUVRlVXUzMGhtUlpsR2RuSkV6cTV3eXhlSkExNWxmRnZKRVJuZlo2dURMRGhIaFVGdjFhL2hMSjhPCk5XWkZ4V3BJNTNFWU40a01oeUJvRXFLdEMwbUVOY05wTHVPUy9ac0F1NWlpVDhkS0hCWUVWWHMvZmlCdnprOW8KelNsb1MyZGErV3IrMFoxSWZJeWJjZ05jWk1KcCtyb3V4bVcxVlFJREFRQUJBb0lCQUdNWk1ySU9WOWhjSENVdgpBN0tjNVlWdXZZV05QR1lOVTM4REVaNGU0MzUwRC9OMWlmWmRpazluOVA5VFMrZjVtRXRuSHdWa2RLb2NaQ3EwCi9uRWluajdNa3d0cTFUc1N5V0dLcGMwSVhTelNsKzRES2N0TkdLOElZL2R1TXRQN1pEU2t5bm5IU2RHelM5SHkKTkgzS0pBU1I5QXFXbVN2c3JVVnVHRjNCSjNlOUVNcXhuSmNneE9hczQvbGZUOFZLMXgyMk51VitNM09yZWdrUwoyc0d5TkFVcGF2TzVCOUQ2WDJuT3Q3amNISjdoNEYwU0lxMFBJRUxucGlKdDJYSGVUU3pKSXRiQTlpT1JSWGxwCmF0OTJ3VWJXK3FFcXFHYzlab1J6bzJyOE1SSWRrc0praG9RZXFqOUljR1JuY0RPYWxTU1NVeXB6amFwWHRjS2sKcVp3QWp4RUNnWUVBLzJiZGVWSFlNZG05WnJnVVJabURqT09BVmNMSkl3Mld5bzZiR2gwa0tLMGJNdEVWWE04dApkWjM4U2JxUGVSZHloNG1yaXhTZUJWMDVDc2xqbDQzeFhqVUZkdGpSN21pbFBUQkFxS081TXdic2FsR0htc3l0CkFlSDdoMElMRVhNcTR0UDVPYk9hWnBsVGlKRHVER1ZpYTJqVFpFQVhRb0loMzAwNE80V0UwczhDZ1lFQTA5TmIKNnRvdVhWeXVQWG1xbFZWb1ljSTNLQjJ0UVhXcUF1VSsvZkJaVDZRT280a09UMllnalltVnh1SW5jRjdqc1FMYwpEb3dHZTZJMEtvcUNzUElxa0p0S2JVYyt3c3ZEam5pa1JSNHBST1RyalM3eUQ4c0d0VjV3SDNVUjZvekpVOHp0ClVUOXZHVy9RYzNBdnZSNUZRTmhaNDJ2WCtZUHozLzNTbmRrUlRwc0NnWUJxTE94VG5EZkJlYUNvakV3NUp2bXUKWDRHaHBZbVNuZnFiR0svWUtsYzI1Y2dSMXlRSFlrV215TmZ5R3JHOGlJZmZXdGRLeVhac0NuWkZTcld4Y3B2dQpLeUtyWnJYWFkvK1ZzWEtmNlBoOEF4dlRrek5Kb2w4bUVqbEw0S1BUVEVwKzQ5cVBKMkEvMk93R01TSVZXeXlRCk5KTDA4VVA2TFRsQnFNdUF5eUFOdHdLQmdRRE1VN2Y1ZmNaMWxuNFluTERZWmM5UFpWbHhIOWROS21QNWNRcm4KUngzUGZkQUZIYUtwSWpsS2Jpb0U4NGZabEI2TVU5YlRUV3kvZTRKWWVzMkFROGlkUHI1M1ZOOE1aaU5YM2JXUApXbjJ5a0NOTFI3cUZVM0ZBS0QxOWlwN2lvalZkMlhJZUNsTnZ0UFRkTloxSER0ajhxUFZrTTFYY3dOVFEvdlZYCi85ZjYrd0tCZ1FEZ25GaGNVMXVwSjh
odVZ1UHptY2t0WWh6VFVhRjhKWndFQnZlaVNhclpVWkw3bWFwdURHWEYKVmV6S2Flc3ZjVCsyeithd2ppTjdoUnBLTEwwWk51OHFTTXpFM3dTK3R0SlZLRm9PT0tqdnY3VVFxaWc2NG5xVQoyNWQyWXd1UU4xMFNBdURhS0dhcDZzRkdoZ01DaFR3OEZoTTUrbzlhUlhQMW42Y292RzNrVkE9PQotLS0tLUVORCBSU0EgUFJJVkFURSBLRVktLS0tLQo=

# Use FullyQualifiedDomainNames (FQDN) (e.g. machine1.corp.example.com)
[all:vars]
ansible_connection=ssh
ansible_user=root
[primary-master]
master.h0st.space

[secondary-masters]
# If there is only one master, make this section empty

[masters:children]
primary-master
secondary-masters

[nodes]
# If there is only one machine both master and node, make this section empty
# Best practice is to have few machines allocated for Prometheus/Ingresses/eventual ELK. 
#      These are usually labeled "infra", and tainted with NoSchedule or at least PreferNoSchedule
#      See "taint_for_label" in group_vars/all/global.yaml
#node[1:2].corp.example.com label=node-role.kubernetes.io/infra=
# All other nodes are automatically labeled "compute" and without any taint.

node[0:0].h0st.space label=node-role.kubernetes.io/infra=

[node:children]
nodes

ReSearchITEng commented 5 years ago

My first hunch is that it comes from https://github.com/ReSearchITEng/kubeadm-playbook/blob/2440e5827e451a8694ce28ab744b177ff81ee181/roles/primary-master/tasks/main.yml#L308 (this replacement is done for both admin.conf and kubelet.conf to overcome issues with http_proxy, in case one is being used). BTW, are you using IPv6? That might require a fix to the regex.

It would be really helpful if you could skip this task (comment it out) and see whether that solves the issue. Ideally, share admin.conf as it was before being updated, so we can understand why the replace goes wrong (in case this is the issue).
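
To make the suspected failure mode concrete, here is a minimal sketch of a line-anchored rewrite of the `server:` entry in a kubeconfig. It is not the playbook's actual task: the function name `set_server`, the pattern, and the default port are assumptions chosen for illustration. The idea is to match the whole `server:` line by its key rather than by parts of the host name, so that an unexpected host value (a reverse-DNS name, an IPv6 literal) cannot be partially replaced.

```python
import re

# Hypothetical pattern: match the entire "server:" line, keeping only its indent.
# [ \t]* is used instead of \s* so the match never spans a preceding blank line.
SERVER_LINE = re.compile(r"^(?P<indent>[ \t]*)server:.*$", re.MULTILINE)


def set_server(kubeconfig_text: str, host: str, port: int = 6443) -> str:
    """Return kubeconfig_text with its single 'server:' line pointing at host:port.

    Raises ValueError if the file does not contain exactly one 'server:' line,
    which is what an already mangled admin.conf would look like.
    """
    new_text, count = SERVER_LINE.subn(
        lambda m: f"{m.group('indent')}server: https://{host}:{port}",
        kubeconfig_text,
    )
    if count != 1:
        raise ValueError(f"expected exactly one 'server:' line, found {count}")
    return new_text
```

Calling `set_server(text, "master.h0st.space")` on an intact file would leave every other line untouched. Refusing to proceed when the `server:` key is missing is a deliberate choice here, since silently writing a broken admin.conf is harder to debug than a failed task.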

gallexme commented 5 years ago

I'm not actively using IPv6, but I'm sure every node has some IPv6 address assigned. {{ master_name }} seems to be:
ok: [master.h0st.space] => { "msg": "62-210-136-24.rev.poneytelecom.eu" }

gallexme commented 5 years ago

So apparently it tries to use reverse DNS instead of the FQDN set in the inventory?

ReSearchITEng commented 5 years ago

wait, I see instead of master, it says: node0. Could it be that ansible is confused due to the fact that group name "node" is the same as the name of your machine 'node'?

gallexme commented 5 years ago

@ReSearchITEng sorry, between the two comments I switched cluster_name in all/network.yaml from node0 to master.

I think I found the issue:

https://github.com/ansible/ansible/issues/38777

>>> import socket
>>> socket.gethostname()
'master.h0st.space'
>>> socket.getfqdn()
'62-210-136-24.rev.poneytelecom.eu'

$ hostname -A
62-210-136-24.rev.poneytelecom.eu master.h0st.space
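
The check above can be wrapped into a small script to run on each host before the playbook. This is only a sketch: `local_names` and `fqdn_mismatch` are hypothetical helper names, and the assumption is that a difference between what `socket.getfqdn()` returns and the inventory name is what leads to the wrong name ending up in the generated config.

```python
import socket


def local_names() -> dict:
    """Collect the names this machine reports about itself."""
    return {
        "hostname": socket.gethostname(),  # the name set on the machine
        "fqdn": socket.getfqdn(),          # what the resolver returns; may be a reverse-DNS name
    }


def fqdn_mismatch(inventory_name: str, names: dict) -> bool:
    """True when the resolver's FQDN differs from the inventory name (case-insensitive)."""
    return names["fqdn"].lower() != inventory_name.lower()
```

On the machine from this thread, `fqdn_mismatch("master.h0st.space", local_names())` would be expected to report a mismatch, given the two values shown in the session above.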

ReSearchITEng commented 5 years ago

Is this coming from DNS or from the /etc/hosts file? If you don't have access to DNS, update the hosts file so that hostname -s is indeed short (aka master) and hostname -f is indeed long. Maybe even remove the alias master.h0st.space.

What we could do is skip this replacement task when no proxy is defined... Please update when you have news.

gallexme commented 5 years ago

@ReSearchITEng it is like that already

sudo hostname -f
master.h0st.space
sudo hostname -s
master

hosts file at /etc/hosts:
127.0.0.1       localhost
127.0.1.1       sd-53287.dedibox.fr storage

# The following lines are desirable for IPv6 capable hosts
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters


Only hostname -A returns an additional alias, 62-210-136-24.rev.poneytelecom.eu, as the first entry.
That one isn't from me; probably the cloud provider set it on the network card or something similar.

Edit: trying to change ansible_fqdn to inventory_hostname

gallexme commented 5 years ago

It runs way further now, until TASK [tools : labeling]:

fatal: [node0.h0st.space -> master.h0st.space]: FAILED! => {"changed": true, "cmd": "kubectl label nodes node0 node-role.kubernetes.io/infra= --overwrite", "delta": "0:00:00.069388", "end": "2019-11-06 21:01:44.788673", "msg": "non-zero return code", "rc": 1, "start": "2019-11-06 21:01:44.719285", "stderr": "Error from server (NotFound): nodes \"node0\" not found", "stderr_lines": ["Error from server (NotFound): nodes \"node0\" not found"], "stdout": "", "stdout_lines": []}

Hmm, it tries to use inventory_hostname_short as the node name, but the nodes are registered under inventory_hostname:

kubectl get nodes
NAME                STATUS     ROLES    AGE    VERSION
master.h0st.space   NotReady   master   116s   v1.16.2
node0.h0st.space    NotReady   <none>   48s    v1.16.2
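
The mismatch between the short name used by the labeling task and the registered node names can be illustrated with a small sketch. The helpers `resolve_node_name` and `label_command` are hypothetical and are not part of the playbook; the only assumption is that the list of registered names comes from the NAME column of `kubectl get nodes`.

```python
from typing import Iterable, List


def resolve_node_name(inventory_hostname: str, registered: Iterable[str]) -> str:
    """Map an inventory name (short or FQDN) to the name the node registered under."""
    names = list(registered)
    if inventory_hostname in names:
        return inventory_hostname
    # Fall back to comparing the short (first-label) form of each name.
    short = inventory_hostname.split(".", 1)[0]
    matches = [n for n in names if n.split(".", 1)[0] == short]
    if len(matches) == 1:
        return matches[0]
    raise LookupError(f"no unique node for {inventory_hostname!r} in {names}")


def label_command(node: str, label: str) -> List[str]:
    """Build the same kubectl call that appears in the failing task output."""
    return ["kubectl", "label", "nodes", node, label, "--overwrite"]
```

With the node list above, `resolve_node_name("node0", ["master.h0st.space", "node0.h0st.space"])` yields the full registered name, so the label command targets a node that exists instead of failing with NotFound.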

ReSearchITEng commented 5 years ago

The above is already fixed; pull the new changes.

gallexme commented 5 years ago

As soon as I'm done with an installation I'm going to try merging everything :| I'm trying to document all the issues I find on the way, and hack through fixing them.

ReSearchITEng commented 5 years ago

If there are no more comments, shall we close this issue?

gallexme commented 5 years ago

Once I get it to run through. Right now the master doesn't have a connection to any node yet, because the machines are on different networks.

github-actions[bot] commented 4 years ago

Stale issue message